SEtools 1.10.0
The SEtools package is a set of convenience functions for the Bioconductor class SummarizedExperiment. It facilitates merging, melting, and plotting SummarizedExperiment objects.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SEtools")
NOTE that the heatmap-related functions have been moved to a standalone package, sechm.
Or, to install the latest development version:
BiocManager::install("plger/SEtools")
To showcase the main functions, we will use an example object which contains (a subset of) whole-hippocampus RNAseq of mice after different stressors:
suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(SEtools)
})
data("SE", package="SEtools")
SE
## class: SummarizedExperiment
## dim: 100 20 
## metadata(0):
## assays(2): counts logcpm
## rownames(100): Egr1 Nr4a1 ... CH36-200G6.4 Bhlhe22
## rowData names(2): meanCPM meanTPM
## colnames(20): HC.Homecage.1 HC.Homecage.2 ... HC.Swim.4 HC.Swim.5
## colData names(2): Region Condition
This is taken from Floriou-Servou et al., Biol Psychiatry 2018.
se1 <- SE[,1:10]
se2 <- SE[,11:20]
se3 <- mergeSEs( list(se1=se1, se2=se2) )
se3
## class: SummarizedExperiment
## dim: 100 20 
## metadata(3): se1 se2 anno_colors
## assays(2): counts logcpm
## rownames(100): AC139063.2 Actr6 ... Zfp667 Zfp930
## rowData names(2): meanCPM meanTPM
## colnames(20): se1.HC.Homecage.1 se1.HC.Homecage.2 ... se2.HC.Swim.4
##   se2.HC.Swim.5
## colData names(3): Dataset Region Condition
All assays were merged, along with rowData and colData slots.
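Scaling matters when merging because each object may sit on a different scale; the default scaling behavior of mergeSEs is described just below. As a rough illustration of what a per-object row z-score does, here is a minimal base-R sketch. The matrices and the helper `row_zscore` are hypothetical and are not part of SEtools; this is not the package's implementation.

```r
# Toy matrices standing in for one assay of two objects (hypothetical data)
m1 <- matrix(c(1, 2, 3,
               10, 20, 30), nrow = 2, byrow = TRUE,
             dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))
m2 <- m1 * 100  # same features, measured on a very different scale
colnames(m2) <- c("s4", "s5", "s6")

# Row z-score: centre each row on its mean and divide by its standard deviation
row_zscore <- function(m) t(scale(t(m)))

# Scale each object separately, then bind the samples together
merged <- cbind(row_zscore(m1), row_zscore(m2))
merged
```

After scaling, both toy objects contribute values on the same scale even though their raw values differ by a factor of 100.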
By default, row z-scores are calculated for each object when merging. This can be prevented with:
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE)
If more than one assay is present, one can specify a different scaling behavior for each assay:
se3 <- mergeSEs( list(se1=se1, se2=se2), use.assays=c("counts", "logcpm"), do.scale=c(FALSE, TRUE))
It is also possible to merge by rowData columns, which are specified through the mergeBy argument.
In this case, one-to-many and many-to-many mappings can occur, and two behaviors are possible. By default, all combinations are reported, so the same feature of one object may appear several times in the output. If a function is passed through aggFun, the features of each object will be aggregated by mergeBy using this function before merging.
rowData(se1)$metafeature <- sample(LETTERS,nrow(se1),replace = TRUE)
rowData(se2)$metafeature <- sample(LETTERS,nrow(se2),replace = TRUE)
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE, mergeBy="metafeature", aggFun=median)
## Aggregating the objects by metafeature
## Merging...
sechm::sechm(se3, features=row.names(se3))
A single SE can also be aggregated by using the aggSE function:
se1b <- aggSE(se1, by = "metafeature")
## Aggregation methods for each assay:
## counts: sum; logcpm: expsum
se1b
## class: SummarizedExperiment
## dim: 24 10 
## metadata(0):
## assays(2): counts logcpm
## rownames(24): A B ... Y Z
## rowData names(0):
## colnames(10): HC.Homecage.1 HC.Homecage.2 ... HC.Handling.4
##   HC.Handling.5
## colData names(2): Region Condition
If the aggregation function(s) are not specified, aggSE will try to guess decent aggregation functions from the assay names.
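To make the two methods reported above more concrete, here is a minimal base-R sketch of aggregating rows by a grouping variable. It assumes that "expsum" means summing on the exponentiated scale and then taking the log again; that reading, the natural-log base, and the toy data are assumptions for illustration, not taken from the SEtools source.

```r
# Hypothetical toy data: four features mapped to two groups
counts <- matrix(c(1, 2,
                   3, 4,
                   5, 6,
                   7, 8), nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("f", 1:4), c("s1", "s2")))
group <- c("A", "A", "B", "B")

# Count-like assay: sum the rows that share a group label
agg_counts <- rowsum(counts, group)

# Log-scale assay: back-transform, sum within groups, then log again
# (assumed interpretation of "expsum"; natural log used for simplicity)
logvals <- log(counts)
agg_log <- log(rowsum(exp(logvals), group))

agg_counts
agg_log
```

Under this reading, aggregating the log-scale assay gives the log of the aggregated counts, whereas simply summing the log values would not.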
To facilitate plotting features with ggplot2, the meltSE function combines assay values along with row/column data:
d <- meltSE(SE, genes=row.names(SE)[1:4])
head(d)
##   feature        sample Region Condition counts    logcpm
## 1    Egr1 HC.Homecage.1     HC  Homecage 1581.0 4.4284969
## 2   Nr4a1 HC.Homecage.1     HC  Homecage  750.0 3.6958917
## 3     Fos HC.Homecage.1     HC  Homecage   91.4 1.7556317
## 4    Egr2 HC.Homecage.1     HC  Homecage   15.1 0.5826999
## 5    Egr1 HC.Homecage.2     HC  Homecage 1423.0 4.4415828
## 6   Nr4a1 HC.Homecage.2     HC  Homecage  841.0 3.9237691
suppressPackageStartupMessages(library(ggplot2))
ggplot(d, aes(Condition, counts, fill=Condition)) + geom_violin() + 
    facet_wrap(~feature, scales="free")
Figure 1: An example ggplot created from a melted SE
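For readers unfamiliar with the long format shown above, the following base-R sketch reshapes a small matrix and its sample annotation into one row per feature/sample pair. It only illustrates the idea of melting and is not how meltSE is implemented; all object names and values are hypothetical.

```r
# Hypothetical toy assay and sample annotation
assay_mat <- matrix(1:6, nrow = 2,
                    dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))
coldata <- data.frame(Condition = c("ctrl", "ctrl", "treat"),
                      row.names = colnames(assay_mat),
                      stringsAsFactors = FALSE)

# One row per feature/sample pair; as.vector() walks the matrix column by column,
# so features vary fastest within each sample
long <- data.frame(
  feature = rep(rownames(assay_mat), times = ncol(assay_mat)),
  sample  = rep(colnames(assay_mat), each = nrow(assay_mat)),
  value   = as.vector(assay_mat),
  stringsAsFactors = FALSE
)

# Attach the per-sample annotation by matching on sample names
long$Condition <- coldata[long$sample, "Condition"]
long
```

A table of this shape can be passed directly to ggplot, with annotation columns mapped to aesthetics or facets.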
Calculate an assay of log-foldchanges to the controls:
SE <- log2FC(SE, fromAssay="logcpm", controls=SE$Condition=="Homecage")

## R version 4.2.0 RC (2022-04-19 r82224)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.15-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.15-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggplot2_3.3.5               SEtools_1.10.0             
##  [3] SummarizedExperiment_1.26.0 Biobase_2.56.0             
##  [5] GenomicRanges_1.48.0        GenomeInfoDb_1.32.0        
##  [7] IRanges_2.30.0              S4Vectors_0.34.0           
##  [9] BiocGenerics_0.42.0         MatrixGenerics_1.8.0       
## [11] matrixStats_0.62.0          BiocStyle_2.24.0           
## 
## loaded via a namespace (and not attached):
##   [1] Rtsne_0.16             colorspace_2.0-3       rjson_0.2.21          
##   [4] ellipsis_0.3.2         circlize_0.4.14        XVector_0.36.0        
##   [7] GlobalOptions_0.1.2    clue_0.3-60            farver_2.1.0          
##  [10] bit64_4.0.5            AnnotationDbi_1.58.0   fansi_1.0.3           
##  [13] codetools_0.2-18       splines_4.2.0          doParallel_1.0.17     
##  [16] cachem_1.0.6           sechm_1.4.0            geneplotter_1.74.0    
##  [19] knitr_1.38             jsonlite_1.8.0         Cairo_1.5-15          
##  [22] annotate_1.74.0        cluster_2.1.3          png_0.1-7             
##  [25] BiocManager_1.30.17    compiler_4.2.0         httr_1.4.2            
##  [28] assertthat_0.2.1       Matrix_1.4-1           fastmap_1.1.0         
##  [31] limma_3.52.0           cli_3.3.0              htmltools_0.5.2       
##  [34] tools_4.2.0            gtable_0.3.0           glue_1.6.2            
##  [37] GenomeInfoDbData_1.2.8 dplyr_1.0.8            V8_4.1.0              
##  [40] Rcpp_1.0.8.3           jquerylib_0.1.4        vctrs_0.4.1           
##  [43] Biostrings_2.64.0      nlme_3.1-157           iterators_1.0.14      
##  [46] xfun_0.30              stringr_1.4.0          openxlsx_4.2.5        
##  [49] lifecycle_1.0.1        XML_3.99-0.9           edgeR_3.38.0          
##  [52] zlibbioc_1.42.0        scales_1.2.0           TSP_1.2-0             
##  [55] parallel_4.2.0         RColorBrewer_1.1-3     ComplexHeatmap_2.12.0 
##  [58] yaml_2.3.5             curl_4.3.2             memoise_2.0.1         
##  [61] sass_0.4.1             stringi_1.7.6          RSQLite_2.2.12        
##  [64] highr_0.9              randomcoloR_1.1.0.1    genefilter_1.78.0     
##  [67] foreach_1.5.2          seriation_1.3.5        zip_2.2.0             
##  [70] BiocParallel_1.30.0    shape_1.4.6            rlang_1.0.2           
##  [73] pkgconfig_2.0.3        bitops_1.0-7           evaluate_0.15         
##  [76] lattice_0.20-45        purrr_0.3.4            labeling_0.4.2        
##  [79] bit_4.0.4              tidyselect_1.1.2       magrittr_2.0.3        
##  [82] bookdown_0.26          DESeq2_1.36.0          R6_2.5.1              
##  [85] magick_2.7.3           generics_0.1.2         DelayedArray_0.22.0   
##  [88] DBI_1.1.2              withr_2.5.0            mgcv_1.8-40           
##  [91] pillar_1.7.0           survival_3.3-1         KEGGREST_1.36.0       
##  [94] RCurl_1.98-1.6         tibble_3.1.6           crayon_1.5.1          
##  [97] utf8_1.2.2             rmarkdown_2.14         GetoptLong_1.0.5      
## [100] locfit_1.5-9.5         grid_4.2.0             sva_3.44.0            
## [103] data.table_1.14.2      blob_1.2.3             digest_0.6.29         
## [106] xtable_1.8-4           munsell_0.5.0          registry_0.5-1        
## [109] bslib_0.3.1