utils-motif {universalmotif}R Documentation

Motif-related utility functions.

Description

Motif-related utility functions.

Usage

add_gap(motif, gaploc = ncol(motif)%/%2, mingap = 1, maxgap = 5)

compare_columns(x, y, method, bkg1 = rep(1/length(x), length(x)),
  bkg2 = rep(1/length(y), length(y)), nsites1 = 100, nsites2 = 100)

consensus_to_ppm(letter)

consensus_to_ppmAA(letter)

get_consensus(position, alphabet = "DNA", type = "PPM", pseudocount = 1)

get_consensusAA(position, type = "PPM", pseudocount = 0)

get_matches(motif, score)

get_scores(motif)

icm_to_ppm(position)

motif_score(motif, threshold = c(0, 1), use.freq = 1)

log_string_pval(pval)

pcm_to_ppm(position, pseudocount = 0)

position_icscore(position, bkg = numeric(), type = "PPM",
  pseudocount = 1, nsites = 100, relative_entropy = FALSE,
  schneider_correction = FALSE)

ppm_to_icm(position, bkg = numeric(), schneider_correction = FALSE,
  nsites = 100, relative_entropy = FALSE)

ppm_to_pcm(position, nsites = 100)

ppm_to_pwm(position, bkg = numeric(), pseudocount = 1, nsites = 100,
  smooth = TRUE)

pwm_to_ppm(position, bkg = numeric())

round_motif(motif, pct.tolerance = 0.05)

score_match(motif, match)

summarise_motifs(motifs, na.rm = TRUE)

ungap(motif, delete = FALSE)

Arguments

motif

Motif object to calculate scores from, or add/remove gap, or round.

gaploc

numeric Motif gap locations. The gap occurs immediately after every position value. If missing, uses round(ncol(motif) / 2).

mingap

numeric Minimum gap size. Must have one value for every location. If missing, set to 1.

maxgap

numeric Maximum gap size. Must have one value for every location. If missing, set to 5.

x

numeric First column for comparison.

y

numeric Second column for comparison.

method

character(1) Column comparison metric. See compare_motifs() for details.

bkg1

numeric Vector of background probabilities for the first column. Only relevant if method = "ALLR".

bkg2

numeric Vector of background probabilities for the second column. Only relevant if method = "ALLR".

nsites1

numeric(1) Number of sites for the first column. Only relevant if method = "ALLR".

nsites2

numeric(1) Number of sites for the second column. Only relevant if method = "ALLR".

letter

character(1) Any DNA, RNA, or AA IUPAC letter. Ambiguity letters are accepted.

position

numeric A numeric vector representing the frequency or probability for each alphabet letter at a specific position.

alphabet

character(1) One of c('DNA', 'RNA').

type

character(1) One of c('PCM', 'PPM', 'PWM' 'ICM').

pseudocount

numeric(1) Used to prevent zeroes in motif matrix.

score

numeric(1) Logodds motif score.

threshold

numeric(1) Any number of numeric values between 0 and 1 representing score percentage.

use.freq

numeric(1) Use regular motif or the respective multifreq representation.

pval

character(1) String-formatted p-value.

bkg

numeric Should be the same length as the alphabet length.

nsites

numeric(1) Number of sites motif originated from.

relative_entropy

logical(1) Calculate information content as relative entropy or Kullback-Leibler divergence.

schneider_correction

logical(1) Apply sample size correction.

smooth

logical(1) Apply pseudocount correction.

pct.tolerance

numeric(1) or character(1) The minimum tolerated proportion each letter must represent per position in order not to be rounded off, either as a numeric value from 0 to 1 or a percentage written as a string from "0%" to "100%".

match

character(1) Sequence string to calculate score from.

motifs

list A list of universalmotif motifs.

na.rm

logical Remove columns where all values are NA.

delete

logical(1) Clear gap information from motif. If FALSE, then it can be reactivated simply with add_gap(motif).

Value

For consensus_to_ppm() and consensus_to_ppmAA(): a numeric vector of length 4 and 20, respectively.

For get_consensus() and get_consensusAA(): a character vector of length 1.

For get_matches(): a character vector of motif matches.

For motif_score(): a named numeric vector of motif scores.

For log_string_pval(): a numeric vector of length 1.

For position_icscore(): a numeric vector of length 1.

For ppm_to_icm(), icm_to_ppm(), pcm_to_ppm(), ppm_to_pcm(), ppm_to_pwm(), and pwm_to_ppm(): a numeric vector with length equal to input numeric vector.

For round_motif(): the input motif, rounded.

For score_match(): a numeric vector with the match motif score.

For summarise_motifs(): a data.frame with columns representing the universalmotif slots.

Author(s)

Benjamin Jean-Marie Tremblay, b2tremblay@uwaterloo.ca

See Also

create_motif()

Examples

data(examplemotif)

#######################################################################
## add_gap
## Add gap information to a motif.
m <- create_motif()
# Add a gap size 5-8 between positions 4 and 5:
m <- add_gap(m, gaploc = 4, mingap = 5, maxgap = 8)

#######################################################################
## compare_columns
## Compare two numeric vectors using the metrics from compare_motifs()
compare_columns(c(0.5, 0.1, 0.1, 0.2), c(0.7, 0.1, 0.1, 0.1), "PCC")

#######################################################################
## consensus_to_ppm
## Do the opposite of get_consensus. Note that loss of information is
## inevitable. Generates a sequence matrix.
sapply(c("A", "G", "T", "B"), consensus_to_ppm)

#######################################################################
## consensus_to_ppmAA
## Do the opposite of get_consensusAA and generate a motif matrix.
sapply(c("V", "A", "L"), consensus_to_ppmAA)

#######################################################################
## get_consensus
## Get a consensus string from a DNA/RNA motif.
m <- create_motif()["motif"]
apply(m, 2, get_consensus)

#######################################################################
## get_consensusAA
## Get a consensus string from an amino acid motif. Unless each position
## is clearly dominated by a single amino acid, the resulting string will
## likely be useless.
m <- create_motif(alphabet = "AA")["motif"]
apply(m, 2, get_consensusAA, type = "PPM")

#######################################################################
## get_match
## Get all possible motif matches above input score
get_matches(examplemotif, 10)

#######################################################################
## get_scores
## Get all possible scores for a motif
length(get_scores(examplemotif))

#######################################################################
## icm_to_ppm
## Do the opposite of ppm_to_icm.
m <- create_motif(type = "ICM")["motif"]
apply(m, 2, icm_to_ppm)

#######################################################################
## motif_score
## Calculate motif score from different thresholds
m <- normalize(examplemotif)
motif_score(m, c(0, 0.8, 1))

#######################################################################
## log_string_pval
## Get the log of a string-formatted p-value
log_string_pval("1e-400")

#######################################################################
## pcm_to_ppm
## Go from a count type motif to a probability type motif.
m <- create_motif(type = "PCM", nsites = 50)["motif"]
apply(m, 2, pcm_to_ppm, pseudocount = 1)

#######################################################################
## position_icscore
## Similar to ppm_to_icm, except this calculates the position sum.
m <- create_motif()["motif"]
apply(m, 2, position_icscore, type = "PPM", bkg = rep(0.25, 4))

#######################################################################
## ppm_to_icm
## Convert one column from a probability type motif to an information
## content type motif.
m <- create_motif(nsites = 100, pseudocount = 0.8)["motif"]
apply(m, 2, ppm_to_icm, nsites = 100, bkg = rep(0.25, 4))

#######################################################################
## ppm_to_pcm
## Do the opposite of pcm_to_ppm.
m <- create_motif()["motif"]
apply(m, 2, ppm_to_pcm, nsites = 50)

#######################################################################
## ppm_to_pwm
## Go from a probability type motif to a weight type motif.
m <- create_motif()["motif"]
apply(m, 2, ppm_to_pwm, nsites = 100, bkg = rep(0.25, 4))

#######################################################################
## pwm_to_ppm
## Do the opposite of ppm_to_pwm.
m <- create_motif(type = "PWM")["motif"]
apply(m, 2, pwm_to_ppm, bkg = rep(0.25, 4))

#######################################################################
## Note that not all type conversions can be done directly; for those
## type conversions which are unavailable, universalmotif just chains
## together others (i.e. from PCM -> ICM => pcm_to_ppm -> ppm_to_icm)

#######################################################################
## round_motif
## Round down letter scores to 0
m <- create_motif()
## Remove letters from positions which are less than 5% of the total
## position:
round_motif(m, pct.tolerance = 0.05)

#######################################################################
## score_match
## Calculate score of a particular match
score_match(examplemotif, "TATATAT")
score_match(examplemotif, "TATATAG")

#######################################################################
## summarise_motifs
## Create a data.frame of information based on a list of motifs.
m1 <- create_motif()
m2 <- create_motif()
m3 <- create_motif()
summarise_motifs(list(m1, m2, m3))

#######################################################################
## ungap
## Unset motif's gap status. Does not delete actual gap data unless
## delete = TRUE.
m <- create_motif()
m <- add_gap(m, 3, 2, 4)
m <- ungap(m)
# Restore gap data:
m <- add_gap(m)


[Package universalmotif version 1.6.4 Index]