collapseBatch {CNPBayes}    R Documentation

Estimate batch from a collection of chemistry plates or some other variable that captures the time in which the arrays were processed.

Description

In high-throughput assays, low-level summaries of copy number at copy number polymorphic loci (e.g., the mean log R ratio for each sample, or a principal-component derived summary) often differ between groups of samples due to technical sources of variation such as reagents, technician, or laboratory. Technical (as opposed to biological) differences between groups of samples are referred to as batch effects. A useful surrogate for batch is the chemistry plate on which the samples were hybridized.

In large studies, a Bayesian hierarchical mixture model with plate-specific means and variances is computationally prohibitive. However, chemistry plates processed at similar times may be qualitatively similar in terms of the distribution of the copy number summary statistic. Further, we have observed that some copy number polymorphic loci exhibit very little evidence of a batch effect, while other loci are more prone to technical variation. We suggest combining plates whose distributions are qualitatively similar according to the Kolmogorov-Smirnov two-sample test, and applying this test independently to each candidate copy number polymorphism identified in a study.

The collapseBatch function is a wrapper to ks.test, implemented in the stats package, that compares all pairwise combinations of plates. The test is applied recursively to the batch variables defined for a given CNP until no more batches can be combined.
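The collapsing procedure described above can be sketched in base R. This is a minimal illustration and not the CNPBayes implementation: the helpers merge_one_pair and collapse_sketch are hypothetical names, the toy data are made up, and the sketch assumes that a pair of plates is combined when the ks.test p-value is at least THR.

```r
# Hypothetical helper (not part of CNPBayes): combine the first pair of
# batches whose two-sample KS p-value is at least THR, then return.
merge_one_pair <- function(y, batch, THR = 0.1) {
  ub <- unique(batch)
  if (length(ub) < 2) return(batch)
  pairs <- utils::combn(ub, 2)  # all pairwise combinations, one per column
  for (j in seq_len(ncol(pairs))) {
    b1 <- pairs[1, j]
    b2 <- pairs[2, j]
    p <- suppressWarnings(
      stats::ks.test(y[batch == b1], y[batch == b2])$p.value
    )
    if (p >= THR) {
      # Assumption: failing to reject the null means the plates are similar
      batch[batch %in% c(b1, b2)] <- paste(b1, b2, sep = ",")
      return(batch)
    }
  }
  batch
}

# Hypothetical driver: repeat the pairwise pass until no pair can be combined.
collapse_sketch <- function(y, provisional_batch, THR = 0.1) {
  batch <- as.character(provisional_batch)
  repeat {
    updated <- merge_one_pair(y, batch, THR)
    if (identical(updated, batch)) break
    batch <- updated
  }
  batch
}

# Toy data: plates A and B are nearly identical, plate C is shifted far away.
y <- c(seq(-1, 1, length.out = 40),
       seq(-1, 1, length.out = 40) + 0.01,
       seq(9, 11, length.out = 40))
plate <- rep(c("A", "B", "C"), each = 40)
new_batch <- collapse_sketch(y, plate)
table(new_batch)
```

In this toy example plates A and B are combined under the label "A,B", while plate C remains its own batch because its distribution does not overlap the others.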

Usage

collapseBatch(object, provisional_batch, THR = 0.1)

## S4 method for signature 'MultiBatchModel'
collapseBatch(object)

## S4 method for signature 'SummarizedExperiment'
collapseBatch(object, provisional_batch,
  THR = 0.1)

## S4 method for signature 'numeric'
collapseBatch(object, provisional_batch, THR = 0.1)

Arguments

object

see showMethods(collapseBatch)

provisional_batch

a vector labelling the batch from which each observation came.

THR

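p-value threshold for the Kolmogorov-Smirnov test. When the p-value for a pair of batches falls below THR, the null hypothesis of a common distribution is rejected and the pair is kept separate; otherwise the two batches are collapsed.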

Value

The new batch assignment: a label for each observation, with collapsed batches sharing a label.

Examples

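library(CNPBayes)
mb.ex <- MultiBatchModelExample
## provisional batch labels stored in the example model
batches <- batch(mb.ex)
## collapse provisional batches with similar distributions
bt <- collapseBatch(y(mb.ex), batches)
## recode the collapsed labels as consecutive integers
batches <- as.integer(factor(bt))
## build a multi-batch model using the collapsed batches
hp <- hpList(k=k(mb.ex))[["MB"]]
model <- MB(dat=y(mb.ex),
            hp=hp,
            batch=batches,
            mp=mcmcParams(mb.ex))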

[Package CNPBayes version 1.12.0 Index]