DelayedArray-utils {DelayedArray}    R Documentation
Common operations on DelayedArray objects.
The operations currently supported on DelayedArray objects are:
Delayed operations:
  rbind and cbind
  sweep
  !
  is.na, is.finite, is.infinite, is.nan
  lengths
  nchar, tolower, toupper, grepl, sub, gsub
  pmax2 and pmin2
  t
  statistical functions like dnorm, dbinom, dpois, and dlogis (for the Normal, Binomial, Poisson, and Logistic distributions, respectively) and related functions (documented in DelayedArray-stats)
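
As a minimal sketch of what "delayed" means here, the snippet below wraps small in-memory arrays (the names m, M and s are illustrative only and not part of the package) and applies a few of the operations listed above. It assumes only that the DelayedArray package is installed. Each call returns another DelayedArray object, and the values are computed only when the result is realized, for example with as.matrix().

```r
library(DelayedArray)

# Small in-memory inputs; no on-disk backend is needed for this sketch.
m <- matrix(c(1.5, NA, -2, 4, 0.25, -7), nrow=2)
M <- DelayedArray(m)
s <- DelayedArray(matrix(c("Ab", "cDe", "F", "ghIJ"), nrow=2))

# Each call below returns another DelayedArray object; the operation
# is recorded on the object rather than computed immediately.
na_mask <- is.na(M)
clipped <- pmax2(M, 0)
lower   <- tolower(s)
widths  <- nchar(s)
stacked <- rbind(M, M)
flipped <- t(M)

# Realizing a result as an ordinary matrix triggers the computation.
as.matrix(na_mask)
as.matrix(clipped)
as.matrix(widths)
```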
Block-processed operations:
  anyNA, which
  unique, table
  all the members of the Summary group
  mean
  apply
  matrix multiplication (%*%) of an ordinary matrix by a DelayedMatrix object
  matrix row/col summarization (see ?`DelayedMatrix-stats`)
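
By contrast, the block-processed operations return ordinary R values rather than DelayedArray objects. The sketch below illustrates this on a small in-memory array (the names a, A and B are illustrative only), again assuming only that the DelayedArray package is installed. On an array this small the block processing itself is not visible; the point is the type of the result.

```r
library(DelayedArray)

a <- array(c(3L, 1L, 2L, 3L, 1L, 1L, 2L, 2L), c(2, 2, 2))
A <- DelayedArray(a)
B <- A + 1L         # delayed: B is still a DelayedArray object

# The calls below walk the array block by block and return
# ordinary (non-delayed) R objects.
has_na <- anyNA(B)  # logical scalar
rng    <- range(B)  # range() is a member of the Summary group
avg    <- mean(B)   # numeric scalar
counts <- table(B)  # ordinary table object
```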
cbind in the base package for rbind/cbind'ing ordinary arrays.
arbind and acbind in this package (DelayedArray) for binding ordinary arrays of arbitrary dimensions along their rows or columns.
is.na, !, table, mean, apply, and %*% in the base package for the corresponding operations on ordinary arrays or matrices.
DelayedArray-stats for statistical functions on DelayedArray objects.
DelayedMatrix-stats for DelayedMatrix row/col summarization.
setRealizationBackend for how to set a realization backend.
writeHDF5Array in the HDF5Array package for writing an array-like object to an HDF5 file and other low-level utilities to control the location of automatically created HDF5 datasets.
DelayedArray objects.
HDF5Array objects in the HDF5Array package.
S4groupGeneric in the methods package for the members of the Ops, Math, and Math2 groups.
array objects in base R.
## ---------------------------------------------------------------------
## BIND DelayedArray OBJECTS
## ---------------------------------------------------------------------
## DelayedArray objects can be bound along their 1st (rows) or 2nd
## (columns) dimension with rbind() or cbind(). These operations are
## equivalent to arbind() and acbind(), respectively, and are all
## delayed.

## On 2D objects:
library(HDF5Array)
toy_h5 <- system.file("extdata", "toy.h5", package="HDF5Array")
h5ls(toy_h5)

M1 <- HDF5Array(toy_h5, "M1")
M2 <- HDF5Array(toy_h5, "M2")

M12 <- rbind(M1, t(M2))        # delayed
M12
colMeans(M12)                  # block-processed

## On objects with more than 2 dimensions:
example(arbind)  # to create arrays a1, a2, a3

A1 <- DelayedArray(a1)
A2 <- DelayedArray(a2)
A3 <- DelayedArray(a3)
A123 <- rbind(A1, A2, A3)      # delayed
A123

## On 1D objects:
v1 <- array(11:15, 5, dimnames=list(LETTERS[1:5]))
v2 <- array(letters[1:3])
V1 <- DelayedArray(v1)
V2 <- DelayedArray(v2)
V12 <- rbind(V1, V2)
V12

## Not run:
cbind(V1, V2)  # Error! (the objects to cbind() must have at least 2
               # dimensions)

## End(Not run)

## Note that base::rbind() and base::cbind() do something completely
## different on ordinary arrays that are not matrices. They treat them
## as if they were vectors:
rbind(a1, a2, a3)
cbind(a1, a2, a3)
rbind(v1, v2)
cbind(v1, v2)

## Also note that DelayedArray objects of arbitrary dimensions can be
## stored inside a DataFrame object as long as they all have the same
## first dimension (nrow()):
DF <- DataFrame(M=I(tail(M1, n=5)), A=I(A3), V=I(V1))
DF[-3, ]
DF2 <- rbind(DF, DF)
DF2$V

## Sanity checks:
m1 <- as.matrix(M1)
m2 <- as.matrix(M2)
stopifnot(identical(rbind(m1, t(m2)), as.matrix(M12)))
stopifnot(identical(arbind(a1, a2, a3), as.array(A123)))
stopifnot(identical(arbind(v1, v2), as.array(V12)))
stopifnot(identical(rbind(DF$M, DF$M), DF2$M))
stopifnot(identical(rbind(DF$A, DF$A), DF2$A))
stopifnot(identical(rbind(DF$V, DF$V), DF2$V))

## ---------------------------------------------------------------------
## MORE OPERATIONS
## ---------------------------------------------------------------------

M1 >= 0.5 & M1 < 0.75          # delayed
log(M1)                        # delayed
pmax2(M2, 0)                   # delayed

## table() is block-processed:
a4 <- array(sample(50L, 2000000L, replace=TRUE), c(200, 4, 2500))
A4 <- as(a4, "HDF5Array")
table(A4)
a5 <- array(sample(20L, 2000000L, replace=TRUE), c(200, 4, 2500))
A5 <- as(a5, "HDF5Array")
table(A5)

A4 - 2 * A5                    # delayed
table(A4 - 2 * A5)             # block-processed

## range() is block-processed:
range(A4 - 2 * A5)
range(M1)

cmeans <- colMeans(M2)         # block-processed
sweep(M2, 2, cmeans)           # delayed

## ---------------------------------------------------------------------
## MATRIX MULTIPLICATION
## ---------------------------------------------------------------------

## Matrix multiplication is not delayed: the output matrix is realized
## block by block. The current "realization backend" controls where
## realization happens e.g. in memory if set to NULL or in an HDF5 file
## if set to "HDF5Array". See '?realize' for more information about
## "realization backends".
## The output matrix is returned as a DelayedMatrix object with no delayed
## operations on it. The exact class of the object depends on the backend
## e.g. it will be HDF5Matrix with "HDF5Array" backend.

m <- matrix(runif(50000), ncol=nrow(M1))

## Set backend to NULL for in-memory realization:
setRealizationBackend()
P1 <- m %*% M1
P1

## Set backend to HDF5Array for realization in HDF5 file:
setRealizationBackend("HDF5Array")

## With the HDF5Array backend, the output matrix will be written to an
## automatic location on disk:
getHDF5DumpFile()  # HDF5 file where the output matrix will be written
lsHDF5DumpFile()

P2 <- m %*% M1
P2

lsHDF5DumpFile()

## Use setHDF5DumpFile() and setHDF5DumpName() from the HDF5Array package
## to control the location of automatically created HDF5 datasets.

stopifnot(identical(as.array(P1), as.array(P2)))