HDF5Array-class {HDF5Array} | R Documentation |
We provide 2 classes for representing an (on-disk) HDF5 dataset as an array-like object in R:
HDF5Array: A high-level class HDF5Array that extends DelayedArray. All the operations available on DelayedArray objects work on HDF5Array objects.
HDF5ArraySeed: A low-level class for pointing to an HDF5 dataset. No operation can be performed directly on an HDF5ArraySeed object. It needs to be wrapped in a DelayedArray or HDF5Array object first. An HDF5Array object is just an HDF5ArraySeed object wrapped in a DelayedArray object.
## Constructor functions: HDF5Array(filepath, name, type=NA) HDF5ArraySeed(filepath, name, type=NA)
filepath |
The path (as a single character string) to the HDF5 file where the dataset is located. |
name |
The name of the dataset in the HDF5 file. |
type |
|
An HDF5Array object for HDF5Array()
.
An HDF5ArraySeed object for HDF5ArraySeed()
.
DelayedArray objects.
DelayedArray-utils for common operations on DelayedArray objects.
writeHDF5Array
for writting an array-like object
to an HDF5 file.
HDF5-dump-management for controlling the location of automatically created HDF5 datasets.
saveHDF5SummarizedExperiment
and
loadHDF5SummarizedExperiment
in this
package (the HDF5Array package) for saving/loading
an HDF5-based SummarizedExperiment
object to/from disk.
h5ls
in the rhdf5 package.
The rhdf5 package on top of which HDF5Array objects are implemented.
array objects in base R.
## --------------------------------------------------------------------- ## CONSTRUCTION ## --------------------------------------------------------------------- library(rhdf5) library(h5vcData) tally_file <- system.file("extdata", "example.tally.hfs5", package="h5vcData") h5ls(tally_file) ## Pick up "Coverages" dataset for Human chromosome 16: cov0 <- HDF5Array(tally_file, "/ExampleStudy/16/Coverages") cov0 ## --------------------------------------------------------------------- ## dim/dimnames ## --------------------------------------------------------------------- dim(cov0) dimnames(cov0) dimnames(cov0) <- list(paste0("s", 1:6), c("+", "-")) dimnames(cov0) ## --------------------------------------------------------------------- ## SLICING (A.K.A. SUBSETTING) ## --------------------------------------------------------------------- cov1 <- cov0[ , , 29000001:29000007] cov1 dim(cov1) as.array(cov1) stopifnot(identical(dim(as.array(cov1)), dim(cov1))) stopifnot(identical(dimnames(as.array(cov1)), dimnames(cov1))) cov2 <- cov0[ , "+", 29000001:29000007] cov2 as.matrix(cov2) ## --------------------------------------------------------------------- ## SummarizedExperiment OBJECTS WITH DELAYED ASSAYS ## --------------------------------------------------------------------- ## DelayedArray objects can be used inside a SummarizedExperiment object ## to hold the assay data and to delay operations on them. library(SummarizedExperiment) pcov <- cov0[ , 1, ] # coverage on plus strand mcov <- cov0[ , 2, ] # coverage on minus strand nrow(pcov) # nb of samples ncol(pcov) # length of Human chromosome 16 ## The convention for a SummarizedExperiment object is to have 1 column ## per sample so first we need to transpose 'pcov' and 'mcov': pcov <- t(pcov) mcov <- t(mcov) se <- SummarizedExperiment(list(pcov=pcov, mcov=mcov)) se stopifnot(validObject(se, complete=TRUE)) ## A GPos object can be used to represent the genomic positions along ## the dataset: gpos <- GPos(GRanges("16", IRanges(1, nrow(se)))) gpos rowRanges(se) <- gpos se stopifnot(validObject(se)) assays(se)$pcov assays(se)$mcov