The MgDb class in the metagenomeFeatures package contains the sequences and taxonomic information for a 16S rRNA database. This vignette demonstrates the class methods for exploring and subsetting an MgDb-class object, using the demoMgDb object included in the metagenomeFeatures package. MgDb-class objects with full databases are in separate packages, such as the greengenes13.5MgDb package.
MgDb-class Object
library(metagenomeFeatures)
demoMgDb <- get_demoMgDb()
demoMgDb
## MgDb object:[1] "Metadata"
## |ACCESSION_DATE: 3/31/2015
## |URL: https://greengenes.microbio.me
## |DB_TYPE_NAME: GreenGenes-MgDb-Demo
## |DB_TYPE_VALUE: MgDb
## |DB_SCHEMA_VERSION: 1.0
## [1] "Sequence Data:"
## A DNAStringSet instance of length 249
## width seq names
## [1] 1343 GACGAACGCTGGCGGCGTGC...TGAATACGTTCCCGGGCCT 1093016
## [2] 1326 GACGAACGCTGGCGGCGTGC...TGAATACGTTCCCGGGCCT 1083934
## [3] 1334 GATGAACGCTGGCGGCACGC...TGAATGCGTTCCCGGGCCT 1075456
## [4] 1345 GATGAACGCTAGCGGGAGGC...TGAATACGTTCCCGGGCCT 1023948
## [5] 1504 GACGAACGCTGGCGGCGCGC...GGGGTTGATGATTGGGGTG 983909
## ... ... ...
## [245] 1422 TCCGGTTGATCCTGCCGGAG...TCGAAACTGGGCCTCGCGA 4327819
## [246] 1419 CACTGCTATTGGAGTCCGAC...GGGGTTGCGTGAGGGGGGC 4344031
## [247] 1343 CGGTTGATCCTGCCGAAGGC...CCTTGCACACACCGCCCGT 4357608
## [248] 1270 TAACGTGAAGACCGGGATAA...CGAGCAGGTTTTAGGTGAG 4437875
## [249] 1554 TTTTTTCTGAGAATTTGATC...GGGCTGGATCACCTCCTTT 4485266
## [1] "Taxonomy Data:"
## Source: sqlite 3.8.6 [/private/tmp/Rtmp2ANki7/Rinst122f7237d7c49/metagenomeFeatures/extdata/demoTaxa.sqlite]
## From: taxa [249 x 8]
##
## Keys Kingdom Phylum Class
## (chr) (chr) (chr) (chr)
## 1 4324716 k__Bacteria p__Bacteroidetes c__
## 2 246960 k__Bacteria p__Planctomycetes c__028H05-P-BN-P5
## 3 222675 k__Bacteria p__Armatimonadetes c__0319-6E2
## 4 156874 k__Bacteria p__NC10 c__12-24
## 5 4383832 k__Bacteria p__GN02 c__3BR-5F
## 6 4383502 k__Bacteria p__Elusimicrobia c__4-29
## 7 315344 k__Bacteria p__Cyanobacteria c__4C0d-2
## 8 2655590 k__Bacteria p__GN04 c__5bav_B12
## 9 552241 k__Bacteria p__SBR1093 c__A712011
## 10 4327819 k__Archaea p__Crenarchaeota c__AAG
## .. ... ... ... ...
## Variables not shown: Order (chr), Family (chr), Genus (chr), Species (chr)
## [1] "Tree Data:"
##
## Phylogenetic tree with 203452 tips and 203451 internal nodes.
##
## Tip labels:
## 1018666, 421164, 989926, 892241, 1046178, 854915, ...
## Node labels:
## , k__Bacteria, , , , , ...
##
## Rooted; includes branch lengths.
taxa_keytypes
taxa_keytypes(demoMgDb)
## [1] "Keys" "Kingdom" "Phylum" "Class" "Order" "Family" "Genus"
## [8] "Species"
taxa_columns(demoMgDb)
## [1] "Keys" "Kingdom" "Phylum" "Class" "Order" "Family" "Genus"
## [8] "Species"
head(taxa_keys(demoMgDb, keytype = c("Kingdom")))
## Source: local data frame [6 x 1]
##
## Kingdom
## (chr)
## 1 k__Bacteria
## 2 k__Bacteria
## 3 k__Bacteria
## 4 k__Bacteria
## 5 k__Bacteria
## 6 k__Bacteria
The select method retrieves database entries for a specified taxonomic group or list of ids. It can return taxonomic information, sequence information, or both.
select(demoMgDb, type = "taxa",
       keys = c("Vibrio", "Salmonella"),
       keytype = "Genus")
## Source: local data frame [0 x 8]
##
## Variables not shown: Keys (chr), Kingdom (chr), Phylum (chr), Class (chr),
## Order (chr), Family (chr), Genus (chr), Species (chr)
select(demoMgDb, type = "seq",
       keys = c("Vibrio", "Salmonella"),
       keytype = "Genus")
## A DNAStringSet instance of length 0
select(demoMgDb, type = "all",
       keys = c("Vibrio", "Salmonella"),
       keytype = "Genus")
## Warning in ape::drop.tip(tree, drop_tips): drop all tips of the tree:
## returning NULL
## $taxa
## Source: local data frame [0 x 8]
##
## Variables not shown: Keys (chr), Kingdom (chr), Phylum (chr), Class (chr),
## Order (chr), Family (chr), Genus (chr), Species (chr)
##
## $seq
## A DNAStringSet instance of length 0
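The genus queries above return empty results for the demo database. As a minimal sketch of a query that should match at least one entry, the example below selects by phylum instead, using the key p__Bacteroidetes that appears in the taxonomy table printed earlier. It assumes that select matches keys against the stored values exactly, including the rank prefix (p__); the variable name bacteroidetes_taxa is only illustrative.

```r
library(metagenomeFeatures)

# Load the demo database shipped with the package
demoMgDb <- get_demoMgDb()

# Select taxonomy entries for one phylum. The key is written with the "p__"
# rank prefix, as it appears in the Phylum column of the demo taxonomy table.
# Assumption: keys are matched exactly against the stored values.
bacteroidetes_taxa <- select(demoMgDb, type = "taxa",
                             keys = "p__Bacteroidetes",
                             keytype = "Phylum")
bacteroidetes_taxa
```

Passing type = "seq" or type = "all" with the same keys and keytype should return the matching sequences, or the taxonomy and sequences together, following the pattern of the calls above.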
sessionInfo()
## R version 3.3.0 (2016-05-03)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.9.5 (Mavericks)
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] magrittr_1.5 metagenomeSeq_1.14.2
## [3] RColorBrewer_1.1-2 glmnet_2.0-5
## [5] foreach_1.4.3 Matrix_1.2-6
## [7] limma_3.28.4 metagenomeFeatures_1.2.2
## [9] Biobase_2.32.0 Biostrings_2.40.0
## [11] XVector_0.12.0 IRanges_2.6.0
## [13] S4Vectors_0.10.0 BiocGenerics_0.18.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.5 formatR_1.4
## [3] GenomeInfoDb_1.8.2 bitops_1.0-6
## [5] iterators_1.0.8 tools_3.3.0
## [7] zlibbioc_1.18.0 digest_0.6.9
## [9] nlme_3.1-128 RSQLite_1.0.0
## [11] evaluate_0.9 lattice_0.20-33
## [13] DBI_0.4-1 yaml_2.1.13
## [15] dplyr_0.4.3 stringr_1.0.0
## [17] hwriter_1.3.2 knitr_1.13
## [19] caTools_1.17.1 gtools_3.5.0
## [21] grid_3.3.0 R6_2.1.2
## [23] BiocParallel_1.6.2 rmarkdown_0.9.6
## [25] gdata_2.17.0 latticeExtra_0.6-28
## [27] gplots_3.0.1 matrixStats_0.50.2
## [29] codetools_0.2-14 Rsamtools_1.24.0
## [31] htmltools_0.3.5 GenomicRanges_1.24.0
## [33] GenomicAlignments_1.8.0 ShortRead_1.30.0
## [35] assertthat_0.1 SummarizedExperiment_1.2.2
## [37] ape_3.4 KernSmooth_2.23-15
## [39] stringi_1.0-1 lazyeval_0.1.10