runGenphen {genphen}R Documentation

Genetic association analysis using Bayesian inference and statistical learning methods

Description

Given a set of genotypes such as single nucleotide polymorphisms (SNPs) or single amino acid polymorphisms (SAAPs) for a set of N individuals, and the corresponding N phenotypes, genphen computes severaly genotype-phenotypes association scores using Bayesian inference and statistical learning.

Usage

runGenphen(genotype, phenotype, phenotype.type,
           mcmc.chains, mcmc.iterations, mcmc.warmup,
           mcmc.cores, hdi.level, stat.learn.method,
           cv.iterations, with.rpa, rpa.iterations,
           rpa.rope)

Arguments

genotype

Character matrix/data frame or a vector, containing SNPs/SAAPs as columns or alternatively as DNAMultipleAlignment or AAMultipleAlignment Biostrings object.

phenotype

Numerical vector for continuous-phenotype analysis, numerical or character vector for dichotonous-phenotype analysis.

phenotype.type

'continuous' or 'dichotomous' based on phenotype type.

mcmc.chains

Number of MCMC chains used to test each association test. We recomend mcmc.chains >= 2.

mcmc.iterations

Length of MCMC chains (default = 1,000).

mcmc.warmup

Length of adaptive MCMC chains (default = 500).

mcmc.cores

Number of cores used for the MCMC (default = 1). The same parameter is for multicore execution of the statistical learning procedures.

hdi.level

Highest density interval (HDI) (default = 0.95).

stat.learn.method

Character parameter used to specify the statistical learning method used in the analysis. Currently two methods are available: random forest ('rf') and support vector machine ('svm'). For no statistical learning select 'none'.

cv.iterations

cross-validation iterations (default = 1,000).

with.rpa

with retrospective power analysis (RPA).

rpa.iterations

Number of simulation used to compute RPA scores.

rpa.rope

Region of practical equivalence (ROPE) for the RPA.

Details

Input:

Goal: To quantify the association between each genotype and phenotype. With genphen, we provide several measures of association:

Value

General parameters:

site

id of the site (e.g. position in the provided sequence alignment)

mutation

type of polymorphism (e.g. T->A)

data

number of data points for each allele (e.g. A:10, T:20)

Association score parameters:

cohens.d or absolute.d

Cohen's d effect size (continuous phenotype analysis) or absolute effect size (dichotomous phenotype analysis). point estimate

cohens.d.L/cohens.d.H or absolute.d.L/absolute.d.H

The highest density interval (HDI) of the estimated effect size.

ca, ca.L, ca.H

Classification accuracy (CA) estimate and its HDI

kappa, kappa.L, kappa.H

Cohen's kappa estimate and its HDI

bc

Bhattacharyya coefficient, degree of overlap between the posterior predicted distributions of the phenotype in the two alleles of a SNP (or two amino acid states of an SAAP.

sd.d, sd.d.L, sd.d.H

(only for continuous phenotypes) Difference in standard deviations its HDI

MCMC convergence parameters:

s, g, n

s=site, g=genotype, n=number of observations

mu.rhat, sigma.rhat

Potential scale reduction factor from the MCMC simulation for each parameter

mu.ess, sigma.ess

Effective sampling size from the MCMC simulation for each parameter

divergence

Indicator of occuring divergences during the MCMC simulation

treedepth

Indicator of treedepth exceptions during the MCMC simulation

RPA parameters:

s, g, n

s=site, g=genotype, n=number of observations

rpa.count

Number of RPA runs in which a significant effect was found

Posterior predictive check parameters:

s, g, n

s=site, g=genotype, n=number of observations

predicted.mu.i, predicted.mu.j, real.mu.i, real.mu.j

Predicted and real means in each group of a SNP

Author(s)

Simo Kitanovski <simo.kitanovski@uni-due.de>

See Also

runDiagnostics, runPhyloBiasCheck

Examples

# I: Continuous phenotype analysis

# genotype inputs:
data(genotype.saap)
# phenotype inputs:
data(phenotype.saap)

# run genphen
continuous.analysis <- runGenphen(genotype = genotype.saap[, 1:3],
                                  phenotype = phenotype.saap,
                                  phenotype.type = "continuous",
                                  mcmc.chains = 2,
                                  mcmc.iterations = 2000,
                                  mcmc.warmup = 500,
                                  mcmc.cores = 2,
                                  hdi.level = 0.95,
                                  stat.learn.method = "rf",
                                  cv.iterations = 500,
                                  with.rpa = FALSE,
                                  rpa.iterations = 100,
                                  rpa.rope = 0)

# II: Dichotomous phenotype analysis

# genotype inputs:
data(genotype.saap)
# phenotype inputs:
data(dichotomous.phenotype.saap)

# run genphen
dichotomous.analysis <- runGenphen(genotype = genotype.saap[, 1:3],
                                   phenotype = dichotomous.phenotype.saap,
                                   phenotype.type = "dichotomous",
                                   mcmc.chains = 2,
                                   mcmc.iterations = 2000,
                                   mcmc.warmup = 500,
                                   mcmc.cores = 2,
                                   hdi.level = 0.95,
                                   stat.learn.method = "rf",
                                   cv.iterations = 500,
                                   with.rpa = FALSE,
                                   rpa.iterations = 100,
                                   rpa.rope = 0)

[Package genphen version 1.8.0 Index]