findLoci {bsseq} | R Documentation |
This is a convenience function to find methylation loci, such as CpGs, in a reference genome.
The result is useful as the value of the loci
argument for read.bismark()
.
findLoci(pattern, subject, include = seqlevels(subject), strand = c("*", "+", "-"), fixed = "subject", resize = TRUE)
pattern |
A string specifying the pattern to search for, e.g. |
subject |
A string containing a file path to the genome sequence, in FASTA or 2bit format, to be searched. Alternatively, a BSgenome or DNAStringSet object storing the genome sequence to be searched. |
include |
A character vector indicating the seqlevels of |
strand |
A character scaler specifying the strand of |
fixed |
If |
resize |
A logical scalar specifying whether the ranges should be shifted to have width 1 and anchored by the start of the locus, e.g., resize a CpG dinucleotide to give the co-ordinates of the cytosine. |
This function provides a convenience wrapper for finding methylation loci in a genome, based on running vmatchPattern()
.
Users requiring finer-grained control should directly use the vmatchPattern()
function and coerce the result to a GRanges object.
A GRanges object storing the found loci.
Peter F. Hickey
Biostrings::vmatchPattern()
?BSgenome::`BSgenome-utils`
library(Biostrings) # Find CpG dinucleotides on the both strands of an artificial sequence my_seq <- DNAStringSet("ATCAGTCGC") names(my_seq) <- "test" findLoci(pattern = "CG", subject = my_seq) # Find CHG trinucleotides on the both strands of an artificial sequence findLoci(pattern = "CHG", subject = my_seq) # Find CpG dinucleotides on the + strand of chr17 from the human genome (hg38) if (requireNamespace("BSgenome.Hsapiens.UCSC.hg38")) { findLoci( pattern = "CG", subject = BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38, include = "chr17", strand = "+") }