In this vignette, we demonstrate the unsegmented block bootstrap functionality implemented in nullranges. “Unsegmented” means that this implementation does not use a segmentation of the genome when sampling blocks; see the segmented block bootstrap vignette for the alternative implementation.
First we use the DNase hypersensitivity peaks in A549 downloaded from AnnotationHub, and pre-processed as described in the nullrangesOldData package.
The following chunk of code evaluates various types of bootstrap/permutation schemes, first within chromosome, and then across chromosomes (the default). The default type is bootstrap, and the default for withinChrom is FALSE (bootstrapping with blocks moving across chromosomes).
set.seed(5) # reproducibility
library(nullranges) # provides bootRanges()
library(microbenchmark)
blockLength <- 5e5
# dhs is the GRanges object of A549 DHS peaks described above
# time each of the four schemes 10 times
microbenchmark(
  list=alist(
    p_within=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=TRUE),
    b_within=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=TRUE),
    p_across=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=FALSE),
    b_across=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=FALSE)
  ), times=10)
## Unit: milliseconds
##      expr       min        lq      mean    median        uq       max neval cld
##  p_within 1010.6040 1031.6861 1056.2933 1056.4756 1073.5097 1112.6601    10   b
##  b_within  881.9462  918.9972  965.7295  941.7946 1005.7543 1113.5343    10   b
##  p_across  236.9382  243.4249  362.5208  262.4285  288.0741 1257.4925    10   a
##  b_across  277.8253  279.9926  296.4042  291.0987  314.4386  327.5488    10   a
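The difference between the two values of type can be illustrated on plain block indices. The following is a minimal base R sketch, not the nullranges implementation: it assumes that "permute" places every block exactly once in shuffled order, while "bootstrap" draws blocks with replacement. The names n_blocks, perm_blocks and boot_blocks are made up for this illustration.

```r
# Illustrative only: resampling block indices rather than genomic ranges
set.seed(1)
n_blocks <- 6
blocks <- seq_len(n_blocks)

# permutation: each block is used exactly once, in shuffled order
perm_blocks <- sample(blocks)

# bootstrap: blocks are drawn with replacement, so some blocks
# may appear several times and others not at all
boot_blocks <- sample(blocks, replace = TRUE)

perm_blocks
boot_blocks
```

Sorting perm_blocks always recovers 1 to 6, whereas boot_blocks has the same length but may repeat or omit blocks.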
We create some synthetic ranges in order to visualize the different options of the unsegmented bootstrap implemented in nullranges.
library(GenomicRanges)
seq_nms <- rep(c("chr1","chr2","chr3"),c(4,5,2))
gr <- GRanges(seqnames=seq_nms,
              IRanges(start=c(1,101,121,201,
                              101,201,216,231,401,
                              1,101),
                      width=c(20, 5, 5, 30,
                              20, 5, 5, 5, 30,
                              80, 40)),
              seqlengths=c(chr1=300,chr2=450,chr3=200),
              chr=factor(seq_nms))
The following function uses functionality from plotgardener to plot the ranges. Note in the plotting helper function that chr will be used to color ranges by chromosome of origin.
suppressPackageStartupMessages(library(plotgardener))
plotGRanges <- function(gr) {
  pageCreate(width = 5, height = 2, xgrid = 0,
             ygrid = 0, showGuides = FALSE)
  for (i in seq_along(seqlevels(gr))) {
    chrom <- seqlevels(gr)[i]
    chromend <- seqlengths(gr)[[chrom]]
    suppressMessages({
      p <- pgParams(chromstart = 0, chromend = chromend,
                    x = 0.5, width = 4*chromend/500, height = 0.5,
                    at = seq(0, chromend, 50),
                    fill = colorby("chr", palette=palette.colors))
      prngs <- plotRanges(data = gr, params = p,
                          chrom = chrom,
                          y = 0.25 + (i-1)*.7,
                          just = c("left", "bottom"))
      annoGenomeLabel(plot = prngs, params = p, y = 0.30 + (i-1)*.7)
    })
  }
}
Visualizing two permutations of blocks within chromosome:
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, type="permute", withinChrom=TRUE)
  plotGRanges(gr_prime)
}
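To make the idea of permuting blocks within a chromosome more concrete, here is a base R sketch on plain start/width vectors. It is an illustration under stated assumptions, not the code used by bootRanges: the helper permute_blocks_within is hypothetical, blocks are assumed to tile the chromosome from position 1, and each range is assigned to the block containing its start.

```r
# Hypothetical helper, not part of nullranges: shuffle fixed-width blocks
# within one chromosome and move each range along with its block
permute_blocks_within <- function(start, width, seqlength, block_length) {
  n_blocks <- ceiling(seqlength / block_length)
  # 0-based index of the block that contains each range start
  block <- (start - 1) %/% block_length
  # new_pos[b + 1] is the 0-based slot that block b moves to
  new_pos <- sample(n_blocks) - 1
  shift <- (new_pos[block + 1] - block) * block_length
  new_start <- start + shift
  # drop ranges that would run past the chromosome end
  keep <- new_start + width - 1 <= seqlength
  data.frame(start = new_start[keep], width = width[keep])
}

set.seed(1)
# the four chr1 ranges of the synthetic example (chromosome length 300)
chr1 <- permute_blocks_within(start = c(1, 101, 121, 201),
                              width = c(20, 5, 5, 30),
                              seqlength = 300, block_length = 100)
chr1
```

Because whole blocks move, the offset of each range within its block is preserved, and the two ranges that share a block (original starts 101 and 121) remain 20 bp apart after permutation.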
Visualizing two bootstraps within chromosome:
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, withinChrom=TRUE)
  plotGRanges(gr_prime)
}
Visualizing two permutations of blocks across chromosomes. Here we use larger blocks than previously.
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, type="permute", withinChrom=FALSE)
  plotGRanges(gr_prime)
}
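As a rough sense of scale for the two block lengths used with the synthetic ranges, the following arithmetic counts how many blocks are needed to cover each chromosome. It assumes blocks are laid end to end from the start of each chromosome, with a shorter final block where the length does not divide evenly; this is an assumption for illustration, not a description of how bootRanges places blocks.

```r
# Rough arithmetic only; the end-to-end tiling is an assumption
seqlens <- c(chr1 = 300, chr2 = 450, chr3 = 200)

# number of blocks needed to cover each chromosome
n_blocks_100 <- ceiling(seqlens / 100)
n_blocks_200 <- ceiling(seqlens / 200)

rbind(blockLength_100 = n_blocks_100, blockLength_200 = n_blocks_200)
```

With blockLength=100 the chromosomes need 3, 5 and 2 blocks; with blockLength=200 they need 2, 3 and 1, so chr3 is covered by a single block.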
Visualizing two bootstraps across chromosomes:
for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, withinChrom=FALSE)
  plotGRanges(gr_prime)
}
sessionInfo()
## R version 4.2.0 RC (2022-04-19 r82224 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows Server x64 (build 20348)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=C
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] microbenchmark_1.4.9 tidyr_1.2.0
## [3] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.20.0
## [5] AnnotationFilter_1.20.0 GenomicFeatures_1.48.0
## [7] AnnotationDbi_1.58.0 patchwork_1.1.1
## [9] plyranges_1.16.0 nullrangesData_1.1.1
## [11] ExperimentHub_2.4.0 AnnotationHub_3.4.0
## [13] BiocFileCache_2.4.0 dbplyr_2.1.1
## [15] ggplot2_3.3.5 plotgardener_1.2.0
## [17] nullranges_1.2.0 InteractionSet_1.24.0
## [19] SummarizedExperiment_1.26.0 Biobase_2.56.0
## [21] MatrixGenerics_1.8.0 matrixStats_0.62.0
## [23] GenomicRanges_1.48.0 GenomeInfoDb_1.32.0
## [25] IRanges_2.30.0 S4Vectors_0.34.0
## [27] BiocGenerics_0.42.0
##
## loaded via a namespace (and not attached):
## [1] plyr_1.8.7 RcppHMM_1.2.2
## [3] lazyeval_0.2.2 splines_4.2.0
## [5] BiocParallel_1.30.0 TH.data_1.1-1
## [7] digest_0.6.29 yulab.utils_0.0.4
## [9] htmltools_0.5.2 fansi_1.0.3
## [11] magrittr_2.0.3 memoise_2.0.1
## [13] ks_1.13.5 Biostrings_2.64.0
## [15] sandwich_3.0-1 prettyunits_1.1.1
## [17] jpeg_0.1-9 colorspace_2.0-3
## [19] blob_1.2.3 rappdirs_0.3.3
## [21] xfun_0.30 dplyr_1.0.8
## [23] crayon_1.5.1 RCurl_1.98-1.6
## [25] jsonlite_1.8.0 survival_3.3-1
## [27] zoo_1.8-10 glue_1.6.2
## [29] gtable_0.3.0 zlibbioc_1.42.0
## [31] XVector_0.36.0 strawr_0.0.9
## [33] DelayedArray_0.22.0 scales_1.2.0
## [35] mvtnorm_1.1-3 DBI_1.1.2
## [37] Rcpp_1.0.8.3 xtable_1.8-4
## [39] progress_1.2.2 gridGraphics_0.5-1
## [41] bit_4.0.4 mclust_5.4.9
## [43] httr_1.4.2 RColorBrewer_1.1-3
## [45] speedglm_0.3-4 ellipsis_0.3.2
## [47] pkgconfig_2.0.3 XML_3.99-0.9
## [49] farver_2.1.0 sass_0.4.1
## [51] utf8_1.2.2 DNAcopy_1.70.0
## [53] ggplotify_0.1.0 tidyselect_1.1.2
## [55] labeling_0.4.2 rlang_1.0.2
## [57] later_1.3.0 munsell_0.5.0
## [59] BiocVersion_3.15.2 tools_4.2.0
## [61] cachem_1.0.6 cli_3.3.0
## [63] generics_0.1.2 RSQLite_2.2.12
## [65] ggridges_0.5.3 evaluate_0.15
## [67] stringr_1.4.0 fastmap_1.1.0
## [69] yaml_2.3.5 knitr_1.38
## [71] bit64_4.0.5 purrr_0.3.4
## [73] KEGGREST_1.36.0 mime_0.12
## [75] pracma_2.3.8 xml2_1.3.3
## [77] biomaRt_2.52.0 compiler_4.2.0
## [79] filelock_1.0.2 curl_4.3.2
## [81] png_0.1-7 interactiveDisplayBase_1.34.0
## [83] tibble_3.1.6 bslib_0.3.1
## [85] stringi_1.7.6 highr_0.9
## [87] lattice_0.20-45 ProtGenerics_1.28.0
## [89] Matrix_1.4-1 vctrs_0.4.1
## [91] pillar_1.7.0 lifecycle_1.0.1
## [93] BiocManager_1.30.17 jquerylib_0.1.4
## [95] data.table_1.14.2 bitops_1.0-7
## [97] httpuv_1.6.5 rtracklayer_1.56.0
## [99] R6_2.5.1 BiocIO_1.6.0
## [101] promises_1.2.0.1 KernSmooth_2.23-20
## [103] codetools_0.2-18 MASS_7.3-57
## [105] assertthat_0.2.1 rjson_0.2.21
## [107] withr_2.5.0 GenomicAlignments_1.32.0
## [109] Rsamtools_2.12.0 multcomp_1.4-19
## [111] GenomeInfoDbData_1.2.8 parallel_4.2.0
## [113] hms_1.1.1 rmarkdown_2.14
## [115] shiny_1.7.1 restfulr_0.0.13