seqMerge {SeqArray} | R Documentation |
Merges multiple SeqArray GDS files.
seqMerge(gds.fn, out.fn, storage.option="LZMA_RA", info.var=NULL, fmt.var=NULL,
    samp.var=NULL, optimize=TRUE, digest=TRUE, geno.pad=TRUE, verbose=TRUE)
gds.fn |
the file names of multiple GDS files |
out.fn |
the output file name |
storage.option |
specify the storage and compression option,
"ZIP_RA" ( |
info.var |
characters, the variable name(s) in the INFO field;
|
fmt.var |
characters, the variable name(s) in the FORMAT field;
|
samp.var |
characters, the variable name(s) in 'sample.annotation';
or |
optimize |
if |
digest |
a logical value (TRUE/FALSE) or a character ("md5", "sha1", "sha256", "sha384" or "sha512"); add md5 hash codes to the GDS file if TRUE or a digest algorithm is specified |
geno.pad |
if TRUE, pad the 2-bit genotype array to whole bytes so that, where possible, genotypes do not have to be recompressed |
verbose |
if |
The function merges multiple SeqArray GDS files. Users can specify the
compression method and level for the new GDS file. If gds.fn contains only
one file, users can change the storage type to create a new file.
WARNING: the functionality of seqMerge() is limited.
Returns the file name of the output GDS file as an absolute path.
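The single-file case described above can be sketched as follows. This is a minimal illustration, assuming the SeqArray package is installed; the output name `recompressed.gds` and the use of `tempdir()` are choices made here for illustration and are not part of the original documentation.

```r
library(SeqArray)

# example GDS file bundled with SeqArray
gds.fn <- seqExampleFileName("gds")

# hypothetical output name, placed in the session's temporary directory
out.fn <- file.path(tempdir(), "recompressed.gds")

# only one input file: the data are rewritten to a new file
# using the requested storage option
res <- seqMerge(gds.fn, out.fn, storage.option="ZIP_RA", verbose=FALSE)

# the returned value is the output file name
print(res)

# remove the temporary output
unlink(out.fn, force=TRUE)
```

Because the result is a file name, it can be passed directly to functions such as seqSummary() or seqOpen() before the file is deleted.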
Xiuwen Zheng
# the VCF file
vcf.fn <- seqExampleFileName("vcf")

# the number of variants
total.count <- seqVCF_Header(vcf.fn, getnum=TRUE)$num.variant

split.cnt <- 5
start <- integer(split.cnt)
count <- integer(split.cnt)

s <- (total.count+1) / split.cnt
st <- 1L
for (i in 1:split.cnt)
{
    z <- round(s * i)
    start[i] <- st
    count[i] <- z - st
    st <- z
}

fn <- paste0("tmp", 1:split.cnt, ".gds")

# convert to 5 gds files
for (i in 1:split.cnt)
{
    seqVCF2GDS(vcf.fn, fn[i], storage.option="ZIP_RA",
        start=start[i], count=count[i])
}

# merge
seqMerge(fn, "tmp.gds", storage.option="ZIP_RA")
seqSummary("tmp.gds")

####

vcf.fn <- seqExampleFileName("gds")
file.copy(vcf.fn, "test.gds", overwrite=TRUE)

# modify 'sample.id'
f <- openfn.gds("test.gds", FALSE)
sid <- read.gdsn(index.gdsn(f, "sample.id"))
add.gdsn(f, "sample.id", paste("S", 1:length(sid)), replace=TRUE)
closefn.gds(f)

# merging
seqMerge(c(vcf.fn, "test.gds"), "output.gds", storage.option="ZIP_RA")

# delete the temporary files
unlink(c("tmp.gds", "test.gds", "output.gds"), force=TRUE)
unlink(fn, force=TRUE)
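The start/count arithmetic used in the first example to split the VCF file can be checked on its own, without any GDS files. The sketch below uses base R only; `split_ranges` is a hypothetical helper name introduced here, not a SeqArray function, and it is a vectorized restatement of the loop in the example rather than a different algorithm.

```r
# Hypothetical helper: split variants 1..total into k contiguous chunks,
# using the same rounding scheme as the loop in the example.
split_ranges <- function(total, k)
{
    s <- (total + 1) / k
    # z[i] is the first index AFTER chunk i
    z <- round(s * seq_len(k))
    # chunk i starts where chunk i-1 ended; the first chunk starts at 1
    start <- c(1, z[-k])
    count <- z - start
    list(start=as.integer(start), count=as.integer(count))
}

r <- split_ranges(1348L, 5L)

# the chunks are contiguous and together cover every variant exactly once
stopifnot(r$start[1L] == 1L)
stopifnot(all(r$start[-1L] == (r$start + r$count)[-5L]))
stopifnot(sum(r$count) == 1348L)
```

The resulting `start` and `count` vectors can be passed element by element to the `start` and `count` arguments of seqVCF2GDS(), as in the loop above.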