mut_matrix_stranded {MutationalPatterns} | R Documentation
Make a mutation count matrix with 192 features: 96 trinucleotide contexts x 2 strands. The strands can be either transcription strands or replication strands.
mut_matrix_stranded(vcf_list, ref_genome, ranges, mode = "transcription")
vcf_list | List of collapsed vcf objects
ref_genome | BSgenome reference genome object
ranges | GRanges object with the genomic ranges of: 1. (transcription mode) the gene bodies with strand (+/-) information, or 2. (replication mode) the replication strand with 'strand_info' metadata
mode | "transcription" or "replication", default = "transcription"
Mutation count matrix with 192 features (96 trinucleotide contexts x 2 strands)
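The 192 features come from crossing the 96 trinucleotide contexts with the 2 strands. The following base-R sketch only illustrates that layout. The strand labels "transcribed" and "untranscribed", the "-" separator, and the feature order are assumptions made for illustration; the row names and ordering of the matrix actually returned by mut_matrix_stranded may differ.

```r
# Six pyrimidine-centred base substitution classes
substitutions <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
bases <- c("A", "C", "G", "T")

# 96 trinucleotide contexts: 5' base, substitution, 3' base.
# expand.grid varies its first argument fastest.
grid <- expand.grid(three_prime = bases,
                    five_prime = bases,
                    substitution = substitutions,
                    stringsAsFactors = FALSE)
contexts <- paste0(grid$five_prime, "[", grid$substitution, "]",
                   grid$three_prime)

# Two strand levels (hypothetical labels, chosen for illustration only)
strands <- c("transcribed", "untranscribed")

# Cross every context with every strand: 96 x 2 = 192 features
features <- paste(rep(contexts, each = length(strands)), strands, sep = "-")

length(contexts)
length(features)
```

In replication mode the two strand levels would instead be the levels of the 'strand_info' metadata column, for example "left" and "right".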
read_vcfs_as_granges, mut_matrix, mut_strand
## See the 'read_vcfs_as_granges()' example for how we obtained the
## following data:
vcfs <- readRDS(system.file("states/read_vcfs_as_granges_output.rds",
                package="MutationalPatterns"))

## Load the corresponding reference genome.
ref_genome = "BSgenome.Hsapiens.UCSC.hg19"
library(ref_genome, character.only = TRUE)

## Transcription strand analysis:
## You can obtain the known genes from the UCSC hg19 dataset using
## Bioconductor:
# if (!requireNamespace("BiocManager", quietly=TRUE))
#   install.packages("BiocManager")
# BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")
# library("TxDb.Hsapiens.UCSC.hg19.knownGene")

## For this example, we preloaded the data for you:
genes_hg19 <- readRDS(system.file("states/genes_hg19.rds",
                    package="MutationalPatterns"))

mut_mat_s = mut_matrix_stranded(vcfs, ref_genome, genes_hg19,
                                mode = "transcription")

## Replication strand analysis:
## Read example bed file with replication direction annotation
repli_file = system.file("extdata/ReplicationDirectionRegions.bed",
                          package = "MutationalPatterns")
repli_strand = read.table(repli_file, header = TRUE)
repli_strand_granges = GRanges(seqnames = repli_strand$Chr,
                               ranges = IRanges(start = repli_strand$Start + 1,
                                                end = repli_strand$Stop),
                               strand_info = repli_strand$Class)

## UCSC seqlevelsstyle
seqlevelsStyle(repli_strand_granges) = "UCSC"

# The levels determine the order in which the features
# will be counted and plotted in the downstream analyses
# You can specify your preferred order of the levels:
repli_strand_granges$strand_info = factor(repli_strand_granges$strand_info,
                                          levels = c("left", "right"))

mut_mat_s_rep = mut_matrix_stranded(vcfs, ref_genome, repli_strand_granges,
                                    mode = "replication")