dcGSA {dcGSA}    R Documentation

Perform gene set analysis for longitudinal gene expression profiles.

Description

Perform gene set analysis for longitudinal gene expression profiles.

Usage

dcGSA(data = NULL, geneset = NULL, nperm = 10, c = 0, KeepPerm=FALSE,
  parallel = FALSE, BPparam = MulticoreParam(workers = 4))

Arguments

data

A list with three elements: ID (a character vector of subject IDs), pheno (a data frame in which each column is one clinical outcome), and gene (a data frame in which each column is one gene).

geneset

A list of gene sets of interest (the output of the readGMT function).

nperm

An integer giving the number of permutations performed to obtain P values.

c

An integer cutoff for the number of genes shared between the data and a gene set.

KeepPerm

A logical value indicating whether the permutation statistics are kept. If both the number of gene sets and the number of permutations are large, the matrix of permutation statistics can be large and memory demanding.

parallel

A logical value indicating whether to use parallel computing.

BPparam

Parameters that configure the parallel evaluation environment when parallel is TRUE. The default uses 4 cores on a single machine. See the BiocParallelParam class in the Bioconductor package BiocParallel for more details.
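
The structure expected for the data argument can be sketched with simulated values. This is only an illustration in base R: the names toy_data, n_subjects and n_visits are hypothetical, and the layout of one row per observation, with rows of pheno and gene aligned to the entries of ID, is an assumption rather than something this page states.

```r
# Toy input in the shape described for `data`; all values are simulated.
set.seed(1)
n_subjects <- 4   # hypothetical number of subjects
n_visits   <- 3   # hypothetical number of repeated measurements per subject

# Assumption: one ID entry per observation, so each subject is repeated.
ID <- rep(paste0("S", seq_len(n_subjects)), each = n_visits)
n_obs <- length(ID)

# Clinical outcomes: one column per outcome.
pheno <- data.frame(outcome1 = rnorm(n_obs), outcome2 = rnorm(n_obs))

# Expression values: one column per gene.
gene <- as.data.frame(matrix(rnorm(n_obs * 5), nrow = n_obs,
                             dimnames = list(NULL, paste0("gene", 1:5))))

# Assemble the three named elements into a single list.
toy_data <- list(ID = ID, pheno = pheno, gene = gene)
str(toy_data, max.level = 1)
```

A list built this way has the three named elements described for data; whether it is suitable for a real analysis depends on the actual study design.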

Value

If KeepPerm=FALSE, returns a data frame with the columns listed below. Otherwise, returns a list with two objects: "res", the data frame just described, and "stat", the permutation statistics.

Geneset

Names for the gene sets.

TotalSize

The original size of each gene set.

OverlapSize

The overlapping number of genes between the data and the gene set.

Stats

Longitudinal distance covariance between the clinical outcomes and the gene set.

NormScore

Only available when permutation is performed. Longitudinal distance covariance normalized using the mean and standard deviation of the permuted values.

P.perm

Only available when permutation is performed. Permutation P values.

P.approx

P values obtained by using a normal distribution to approximate the null distribution.

FDR.approx

False discovery rate (FDR) based on P.approx.
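
How the permutation-based columns relate to one another can be illustrated with made-up numbers in base R. This is a sketch under stated assumptions, not the package's implementation: the upper-tail direction, the plain proportion used for the permutation P value, the use of the permutation mean and standard deviation in the normal approximation, and the Benjamini-Hochberg adjustment are all assumptions, and obs_stat, perm_stat and p_values are hypothetical names and values.

```r
# Made-up numbers for a single gene set; nothing here comes from real data.
set.seed(42)
obs_stat  <- 2.5                              # hypothetical observed statistic
perm_stat <- rnorm(100, mean = 1, sd = 0.5)   # hypothetical permutation statistics

# Normalize with the mean and standard deviation of the permuted values.
norm_score <- (obs_stat - mean(perm_stat)) / sd(perm_stat)

# Assumption: permutation P value as the share of permuted values
# at least as large as the observed statistic.
p_perm <- mean(perm_stat >= obs_stat)

# Assumption: one-sided normal approximation to the null distribution.
p_approx <- pnorm(norm_score, lower.tail = FALSE)

# Assumption: Benjamini-Hochberg adjustment across gene sets
# (the three extra P values are invented for illustration).
p_values <- c(p_approx, 0.20, 0.03, 0.50)
fdr <- p.adjust(p_values, method = "BH")

print(c(NormScore = norm_score, P.perm = p_perm, P.approx = p_approx))
print(fdr)
```

With a small number of permutations a proportion of this kind can be exactly zero, which is one reason a normal approximation is useful alongside it.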

References

Distance-correlation based Gene Set Analysis in Longitudinal Studies. Jiehuan Sun, Jose Herazo-Maya, Xiu Huang, Naftali Kaminski, and Hongyu Zhao.

Examples

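library(dcGSA)
# Load the example data set shipped with the package
data(dcGSAtest)
# Locate and read the sample gene sets in GMT format
fpath <- system.file("extdata", "sample.gmt.txt", package="dcGSA")
GS <- readGMT(file=fpath)
# Run the analysis with 100 permutations and report the elapsed time
system.time(res <- dcGSA(data=dcGSAtest,geneset=GS,nperm=100))
head(res)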

[Package dcGSA version 1.14.0 Index]