BASiCS_MCMC {BASiCS} | R Documentation |
BASiCS MCMC sampler
Description
MCMC sampler to perform Bayesian inference for single-cell
mRNA sequencing datasets using the model described in Vallejos et al (2015).
Usage
BASiCS_MCMC(Data, N, Thin, Burn, Regression, ...)
Arguments
Data |
A SingleCellExperiment object.
This MUST be formatted to include the spike-ins information (see vignette).
|
N |
Total number of iterations for the MCMC sampler.
Use N>=max(4,Thin) , N being a multiple of Thin .
|
Thin |
Thining period for the MCMC sampler. Use Thin>=2 .
|
Burn |
Burn-in period for the MCMC sampler. Use Burn>=1 ,
Burn<N , Burn being a multiple of Thin .
|
Regression |
If Regression = TRUE , BASiCS exploits a joint prior
formulation for mean and over-dispersion parameters to estimate a measure of
residual over-dispersion is not confounded by mean expression. Recommended
setting is Regression = TRUE
|
... |
Optional parameters.
PriorDelta Specifies the prior used for delta .
Possible values are 'gamma' (Gamma(a.theta ,b.theta ) prior) and
'log-normal' (log-Normal(0 ,s2.delta ) prior) .
.
Default value: PriorDelta = 'log-normal' .
PriorParam List of 7 elements, containing the hyper-parameter
values required for the adopted prior (see Vallejos et al, 2015, 2016).
All elements must be positive real numbers.
s2.mu Scale hyper-parameter for the
log-Normal(0 ,s2.mu ) prior that is shared by all
gene-specific expression rate parameters μ_i.
Default: s2.mu = 0.5 .
s2.delta Only used when ‘PriorDelta == ’log-normal''.
Scale hyper-parameter for the log-Normal(0 ,s2.delta )
prior that is shared by all gene-specific over-dispersion parameters
δ_i. Default: s2.delta = 0.5 .
a.delta Only used when ‘PriorDelta == ’gamma''.
Shape hyper-parameter for the Gamma(a.delta ,b.delta )
prior that is shared by all gene-specific biological over-dispersion
parameters δ_i. Default: a.delta = 1 .
b.delta Only used when ‘PriorDelta == ’gamma''.
Rate hyper-parameter for the Gamma(a.delta ,b.delta )
prior that is shared by all gene-specific biological over-dispersion
hyper-parameters δ_i. Default: b.delta = 1 .
p.phi Dirichlet hyper-parameter for the joint of all
(scaled by n ) cell-specific mRNA content normalising
constants φ_j / n.
Default: p.phi = rep(1, n) .
a.s Shape hyper-parameter for the
Gamma(a.s ,b.s ) prior that is shared by all
cell-specific capture efficiency normalising constants s_j.
Default: a.s = 1 .
b.s Rate hyper-parameter for the Gamma(a.s ,
b.s ) prior that is shared by all cell-specific capture
efficiency normalising constants s_j.
Default: b.s = 1 .
a.theta Shape hyper-parameter for the
Gamma(a.theta ,b.theta ) prior for technical noise
parameter θ. Default: a.theta = 1 .
b.theta Rate hyper-parameter for the
Gamma(a.theta ,b.theta ) prior for technical noise
parameter θ. Default: b.theta = 1 .
eta Only used when Regression = TRUE . eta
specifies the degress of freedom for the residual term.
Default: eta = 5 .
.
WithSpikes If WithSpikes = FALSE , the no-spikes
model will be fitted. Default: WithSpikes = TRUE .
k Only used when Regression = TRUE . k specifies
the number of regression Gaussian Radial Basis Functions (GRBF) used
within the correlated prior adopted for gene-specific over-dispersion
and mean expression paramters. Default: k = 12 .
Var Only used when Regression = TRUE . Var
specifies the GRBF scaling parameter. Default: Var = 1.2 .
AR Optimal acceptance rate for adaptive Metropolis Hastings
updates. It must be a positive number between 0 and 1. Default
(and recommended): AR = 0.44
.
StopAdapt Iteration at which adaptive proposals are not longer
adapted. Use StopAdapt>=1 . Default: StopAdapt = Burn .
StoreChains If StoreChains = TRUE , the generated
BASiCS_Chain object is stored as a '.Rds' file (RunName
argument used to index the file name).
Default: StoreChains = FALSE .
StoreAdapt If StoreAdapt = TRUE , trajectory of
adaptive proposal variances (in log-scale) for all parameters is
stored as a list in a '.Rds' file (RunName argument used to
index file name). Default: StoreAdapt = FALSE .
StoreDir Directory where output files are stored.
Only required if StoreChains = TRUE and/or
StoreAdapt = TRUE ). Default: StoreDir = getwd() .
RunName String used to index '.Rds' files storing chains
and/or adaptive proposal variances.
PrintProgress If PrintProgress = FALSE , console-based
progress report is suppressed.
Start Starting values for the MCMC sampler. We do not advise
to use this argument. Default options have been tuned to facilitate
convergence. If changed, it must be a list containing the following
elements: mu0 , delta0 , phi0 , s0 ,
nu0 , theta0 , ls.mu0 , ls.delta0 ,
ls.phi0 , ls.nu0 and ls.theta0
|
Value
An object of class BASiCS_Chain
.
Author(s)
Catalina A. Vallejos cnvallej@uc.cl
Nils Eling eling@ebi.ac.uk
References
Vallejos, Marioni and Richardson (2015). PLoS Computational Biology.
Vallejos, Richardson and Marioni (2016). Genome Biology.
Eling et al (2017). bioRxiv
Examples
# Built-in simulated dataset
Data <- makeExampleBASiCS_Data()
# To analyse real data, please refer to the instructions in:
# https://github.com/catavallejos/BASiCS/wiki/2.-Input-preparation
# Only a short run of the MCMC algorithm for illustration purposes
# Longer runs migth be required to reach convergence
Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = FALSE,
PrintProgress = FALSE)
# To run the regression version of BASiCS, use:
Chain <- BASiCS_MCMC(Data, N = 50, Thin = 2, Burn = 10, Regression = TRUE,
PrintProgress = FALSE)
# For illustration purposes we load a built-in 'BASiCS_Chain' object
# (obtained using the 'BASiCS_MCMC' function)
data(ChainSC)
# `displayChainBASiCS` can be used to extract information from this output.
# For example:
head(displayChainBASiCS(ChainSC, Param = 'mu'))
# Traceplot (examples only)
plot(ChainSC, Param = 'mu', Gene = 1)
plot(ChainSC, Param = 'phi', Cell = 1)
plot(ChainSC, Param = 'theta', Batch = 1)
# Calculating posterior medians and 95% HPD intervals
ChainSummary <- Summary(ChainSC)
# `displaySummaryBASiCS` can be used to extract information from this output
# For example:
head(displaySummaryBASiCS(ChainSummary, Param = 'mu'))
# Graphical display of posterior medians and 95% HPD intervals
# For example:
plot(ChainSummary, Param = 'mu', main = 'All genes')
plot(ChainSummary, Param = 'mu', Genes = 1:10, main = 'First 10 genes')
plot(ChainSummary, Param = 'phi', main = 'All cells')
plot(ChainSummary, Param = 'phi', Cells = 1:5, main = 'First 5 cells')
plot(ChainSummary, Param = 'theta')
# To constrast posterior medians of cell-specific parameters
# For example:
par(mfrow = c(1,2))
plot(ChainSummary, Param = 'phi', Param2 = 's', SmoothPlot = FALSE)
# Recommended for large numbers of cells
plot(ChainSummary, Param = 'phi', Param2 = 's', SmoothPlot = TRUE)
# To constrast posterior medians of gene-specific parameters
par(mfrow = c(1,2))
plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x',
SmoothPlot = FALSE)
# Recommended
plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x',
SmoothPlot = TRUE)
# Highly and lowly variable genes detection (within a single group of cells)
DetectHVG <- BASiCS_DetectHVG(ChainSC, VarThreshold = 0.60,
EFDR = 0.10, Plot = TRUE)
DetectLVG <- BASiCS_DetectLVG(ChainSC, VarThreshold = 0.40,
EFDR = 0.10, Plot = TRUE)
plot(ChainSummary, Param = 'mu', Param2 = 'delta', log = 'x', col = 8)
with(DetectHVG$Table, points(Mu[HVG == TRUE], Delta[HVG == TRUE],
pch = 16, col = 'red', cex = 1))
with(DetectLVG$Table, points(Mu[LVG == TRUE], Delta[LVG == TRUE],
pch = 16, col = 'blue', cex = 1))
# If variance thresholds are not fixed
BASiCS_VarThresholdSearchHVG(ChainSC,
VarThresholdsGrid = seq(0.55,0.65,by=0.01),
EFDR = 0.10)
BASiCS_VarThresholdSearchLVG(ChainSC,
VarThresholdsGrid = seq(0.35,0.45,by=0.01),
EFDR = 0.10)
# To obtain denoised rates / counts, see:
help(BASiCS_DenoisedRates)
help(BASiCS_DenoisedCounts)
# For examples of differential analyses between 2 populations of cells see:
help(BASiCS_TestDE)
[Package
BASiCS version 1.2.1
Index]