BatchCorrectedCounts {CountClust}R Documentation

Obtain Batch effect Corrected counts

Description

This function first converts counts data to log CPM data , then apply a linear model with the batch effect as a factor. We take the sum of intercept, residuals and mean batch effect across all the batches and then inverse transform it back to counts to get rid of batch effects.

Usage

BatchCorrectedCounts(data, batch_lab, use_parallel = TRUE)

Arguments

data

count matrix, with samples along the rows and features along the columns.

batch_lab

batch label vector.

use_parallel

if TRUE, we do a parallel analysis over features, else serial application.

Value

Returns a counts data. with same dimension as the input data, but which is corrected for batch_lab.

Examples


# Simulation example
N=500;
K=4;
G=100;
Label.Batch=c(rep(1,N/4),rep(2,N/4),rep(3,N/4),rep(4,N/4));
alpha_true=matrix(rnorm((K)*G,0.5,1),nrow=(K));
library(gtools)
tt <- 10;
omega_true = matrix(rbind(rdirichlet(tt*10,c(3,4,2,6)),
                         rdirichlet(tt*10,c(1,4,6,3)),
                         rdirichlet(tt*10,c(4,1,2,2)),
                         rdirichlet(tt*10,c(2,6,3,2)),
                         rdirichlet(tt*10,c(3,3,5,4))), nrow=N);
B=max(Label.Batch);
sigmab_true=2;
beta_true=matrix(0,B,G);
for(g in 1:G)
{
    beta_true[,g]=rnorm(B,mean=0,sd=sigmab_true);
}
read_counts=matrix(0,N,G);
for(n in 1:N){
    for(g in 1:G)
    {
        read_counts[n,g]=rpois(1, omega_true[n,]%*%exp(alpha_true[,g]
                                                      + beta_true[Label.Batch[n],g]));
   }
}

batchcorrect_counts <- BatchCorrectedCounts(read_counts, Label.Batch,
                                     use_parallel=FALSE)


[Package CountClust version 1.23.1 Index]