calcGC {genoset} | R Documentation |
Local GC content can be used to remove GC artifacts from copy number data (see Diskin et al., Nucleic Acids Research, 2008, PMID: 18784189). This function calculates the GC content fraction in expanded windows around a set of ranges, following the example in http://www.bioconductor.org/help/course-materials/2012/useR2012/Bioconductor-tutorial.pdf. Currently, all ranges are tabulated directly. Later I may use letterFrequencyInSlidingWindow for big windows and then match each range to the nearest window.
calcGC(object, bsgenome, expand = 1e+06, bases = c("G", "C"))
object | GenomicRanges or GenoSet |
bsgenome | BSgenome (such as Hsapiens from BSgenome.Hsapiens.UCSC.hg19) or DNAStringSet |
expand | scalar integer, amount to expand each range before calculating GC content |
bases | character, alphabet to count, usually c("G", "C"), but "N" is useful too |
named numeric vector, fraction of nucleotides in the expanded ranges of object that are G or C (more generally, that match bases)
## Not run: library(BSgenome.Hsapiens.UCSC.hg19)
## Not run: gc = calcGC(genoset.ds, Hsapiens)
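
The calculation described above can be illustrated on a toy sequence using base R only. This is a minimal sketch, not the genoset implementation: the helper name gc_fraction_sketch, the toy sequence, and the probe names are hypothetical. It also assumes that expand is applied to both sides of each range and that windows are clipped to the sequence bounds.

```r
# Hypothetical helper for illustration; NOT the genoset implementation.
# Assumes `expand` is added on both sides of each range and that
# windows are clipped to the bounds of the sequence.
gc_fraction_sketch <- function(sequence, starts, ends, expand = 2L,
                               bases = c("G", "C")) {
  seq_length <- nchar(sequence)
  win_start <- pmax(starts - expand, 1L)
  win_end <- pmin(ends + expand, seq_length)
  vapply(seq_along(starts), function(i) {
    # Split the window into single letters
    window <- strsplit(substr(sequence, win_start[i], win_end[i]), "")[[1]]
    # Fraction of letters in the window that belong to `bases`
    mean(window %in% bases)
  }, numeric(1))
}

toy_seq <- "AAGGCCTTNNGCGCATAT"
gc <- gc_fraction_sketch(toy_seq, starts = c(3L, 15L), ends = c(6L, 18L))
names(gc) <- c("probe1", "probe2")  # named numeric vector, one value per range
n_frac <- gc_fraction_sketch(toy_seq, starts = 9L, ends = 10L,
                             expand = 0L, bases = "N")
```

Passing bases = "N", as in the last call, gives the fraction of ambiguous bases in each window, which can help flag windows that overlap assembly gaps.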