setup_CPsSearch {InPAS} | R Documentation |
prepare data for predicting cleavage and polyadenylation (CP) sites
setup_CPsSearch( sqlite_db, genome, utr3, background = c("same_as_long_coverage_threshold", "1K", "5K", "10K", "50K"), TxDb = NA, removeScaffolds = FALSE, hugeData = TRUE, outdir, BPPARAM = NULL, silence = FALSE )
sqlite_db |
A path to the SQLite database for InPAS, i.e. the output of
|
genome |
An object of BSgenome::BSgenome |
utr3 |
An object of GenomicRanges::GRangesList, output of
|
background |
A character(1) vector, the range for calculating cutoff threshold of local background. It can be "same_as_long_coverage_threshold", "1K", "5K","10K", or "50K". |
TxDb |
an object of GenomicFeatures::TxDb |
removeScaffolds |
A logical(1) vector, whether the scaffolds should be removed from the genome If you use a TxDb containing alternative scaffolds, you'd better to remove the scaffolds. |
hugeData |
A logical(1) vector, indicating whether it is huge data |
outdir |
A character(1) vector, a path with write permission for storing the coverage data. If it doesn't exist, it will be created. |
BPPARAM |
an optional BiocParallel::BiocParallelParam instance determining the parallel back-end to be used during evaluation, or a list of BiocParallelParam instances, to be applied in sequence for nested calls to bplapply. It can be set to NULL or bpparam() |
silence |
report progress or not. By default it doesn't report progress. |
A list as described below:
chromosome-wise 3' UTR coverage in summarized View format
A filename for chr1 3' UTR coverage in summarized View format
A filename for chr2 3' UTR coverage in summarized View format
A filename for chrN 3' UTR coverage in summarized View format
The type of methods for bckground coverage calculation
Z-score cutoff thresholds for each 3' UTRs
A named vector containing depth weight
Jianhong Ou, Haibo Liu
@examples if (interactive()) library(BSgenome.Mmusculus.UCSC.mm10) library("TxDb.Mmusculus.UCSC.mm10.knownGene")
genome <- BSgenome.Mmusculus.UCSC.mm10 TxDb <- TxDb.Mmusculus.UCSC.mm10.knownGene ## load UTR3 annotation and convert it into a GRangesList data(utr3.mm10) utr3 <- split(utr3.mm10, seqnames(utr3.mm10)) bedgraphs <- system.file("extdata",c("Baf3.extract.bedgraph", "UM15.extract.bedgraph"), package = "InPAS") tags <- c("Baf3", "UM15") metadata <- data.frame(tag = tags, condition = c("Baf3", "UM15"), bedgraph_file = bedgraphs) outdir = tempdir() write.table(metadata, file =file.path(outdir, "metadata.txt"), sep = "\t", quote = FALSE, row.names = FALSE) sqlite_db <- setup_sqlitedb(metadata = file.path(outdir, "metadata.txt"), outdir) coverage <- list() for (i in seq_along(bedgraphs)){ coverage[[tags[i]]] <- get_ssRleCov(bedgraph = bedgraphs[i], tag = tags[i], genome = genome, sqlite_db = sqlite_db, outdir = outdir, removeScaffolds = TRUE) } coverage_files <- assemble_allCov(sqlite_db, outdir, genome, removeScaffolds = TRUE) data4CPsitesSearch <- setup_CPsSearch(sqlite_db, genome, utr3, background = "10K", TxDb = TxDb, removeScaffolds = TRUE, hugeData = TRUE, outdir = outdir)