The ToppGene Suite is a one-stop portal for gene list enrichment analysis and candidate gene prioritization based on functional annotations and protein interaction networks. Although the ToppCluster web application provides convenient graphical access to the ToppGene Suite, the OpenAPI 3.0 compliant interface of ToppGene is better suited to automation and reproducibility. This package was initially generated with OpenAPI Generator and then supplemented with Bioconductor class interfaces and more relevant biological examples.
toppgene 0.99.1
The toppgene package is a client for the ToppGene Suite webserver that takes as input a gene list to perform enrichment analysis.
To demonstrate the use of ToppGene, we reproduce the two test cases from the publication (Chen et al. 2007): congenital heart disease (CHD) and diabetic retinopathy (DR).
To install this package, start R and enter:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("toppgene")
A query may contain one or more genes.
enrich() requires integer Entrez gene IDs.
Because symbol conversion with ToppGene
is more permissive than with Bioconductor
(for example, it resolves the alias RAGE to the official symbol AGER),
use ToppGene’s lookup() function
to convert gene symbols to Entrez IDs.
The published example provides gene symbols for CHD (n = 28) and DR (n = 27)
that we will also use here.
genes_chd_sym <- c(
    "ADD1", "CITED2", "DTNA", "CKM", "GATA4", "GJA1", "HAND1", "HAND2", "HEY2",
    "HOXC4", "HOXC5", "ITGB3", "JARID2", "MTHFD1", "MTHFR", "MTRR", "NKX2-5",
    "NOS3", "NPPA", "NPPB", "RFC1", "SALL4", "TBX1", "TBX5", "TBX20",
    "TGFB1", "ZFPM2", "ZIC3")
genes_dr_sym <- c(
    "ACE", "ADRB3", "AGT", "AGTR2", "AKR1B1", "APOE", "AR", "CMA1", "EDN1",
    "GNB3", "HFE", "HLA-DPB1", "HLA-DRB1", "ICAM1", "ITGA2B", "ITGB2", "LTA",
    "NOS2A", "NOS3", "NPY", "PECAM1", "PON1", "RAGE", "SELE", "SERPINE1",
    "TIMP3", "TNF")
library(toppgene)
genes_chd <- lookup(genes_chd_sym)
genes_chd
#> DataFrame with 28 rows and 4 columns
#> OfficialSymbol Entrez Submitted Description
#> <character> <integer> <character> <character>
#> 1 ADD1 118 ADD1 adducin 1
#> 2 CITED2 10370 CITED2 Cbp/p300 interacting..
#> 3 DTNA 1837 DTNA dystrobrevin alpha
#> 4 CKM 1158 CKM creatine kinase, M-t..
#> 5 GATA4 2626 GATA4 GATA binding protein 4
#> ... ... ... ... ...
#> 24 TBX5 6910 TBX5 T-box transcription ..
#> 25 TBX20 57057 TBX20 T-box transcription ..
#> 26 TGFB1 7040 TGFB1 transforming growth ..
#> 27 ZFPM2 23414 ZFPM2 zinc finger protein,..
#> 28 ZIC3 7547 ZIC3 Zic family member 3
genes_dr <- lookup(genes_dr_sym)
genes_dr
#> DataFrame with 27 rows and 4 columns
#> OfficialSymbol Entrez Submitted Description
#> <character> <integer> <character> <character>
#> 1 ACE 1636 ACE angiotensin I conver..
#> 2 ADRB3 155 ADRB3 adrenoceptor beta 3
#> 3 AGT 183 AGT angiotensinogen
#> 4 AGTR2 186 AGTR2 angiotensin II recep..
#> 5 AKR1B1 231 AKR1B1 aldo-keto reductase ..
#> ... ... ... ... ...
#> 23 AGER 177 RAGE advanced glycosylati..
#> 24 SELE 6401 SELE selectin E
#> 25 SERPINE1 5054 SERPINE1 serpin family E memb..
#> 26 TIMP3 7078 TIMP3 TIMP metallopeptidas..
#> 27 TNF 7124 TNF tumor necrosis factor
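Because the lookup() result keeps the submitted symbol next to the official one, the aliases that ToppGene resolved can be listed by comparing the two columns. The sketch below is a minimal illustration that uses a plain data.frame with four rows copied from the output above instead of a live lookup() call, so it runs without network access. That the same comparison carries over unchanged to the DataFrame returned by lookup() is an assumption.

```r
## Four rows copied from the genes_dr output above (illustrative subset).
genes_dr_subset <- data.frame(
    OfficialSymbol = c("ACE", "ADRB3", "AGER", "TNF"),
    Entrez = c(1636L, 155L, 177L, 7124L),
    Submitted = c("ACE", "ADRB3", "RAGE", "TNF"))

## Keep only the rows where the submitted symbol differs from the
## official symbol, i.e. where an alias was resolved.
remapped <- subset(genes_dr_subset, OfficialSymbol != Submitted)
remapped
```

Only the RAGE row remains, which matches row 23 of the printed table, where RAGE was mapped to AGER.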
enrich_chd <- enrich(genes_chd$Entrez)
enrich_chd
#> DataFrame with 1383 rows and 15 columns
#> Category ID Name
#> <character> <character> <character>
#> 1 GeneOntologyMolecula.. GO:0008134 transcription factor..
#> 2 GeneOntologyMolecula.. GO:0061629 RNA polymerase II-sp..
#> 3 GeneOntologyMolecula.. GO:0140297 DNA-binding transcri..
#> 4 GeneOntologyMolecula.. GO:0001228 DNA-binding transcri..
#> 5 GeneOntologyMolecula.. GO:0001216 DNA-binding transcri..
#> ... ... ... ...
#> 1379 Disease DOID:2841 (is_implic.. asthma (is_implicate..
#> 1380 Disease C1449563 Cardiomyopathy, Fami..
#> 1381 Disease EFO_0006340, EFO_000.. mean arterial pressu..
#> 1382 Disease EFO_0000612 myocardial infarction
#> 1383 Disease DOID:13550 (is_impli.. angle-closure glauco..
#> PValue QValueFDRBH QValueFDRBY QValueBonferroni TotalGenes
#> <numeric> <numeric> <numeric> <numeric> <integer>
#> 1 6.31374e-12 1.37639e-09 8.20882e-09 1.37639e-09 19978
#> 2 1.66715e-10 1.46986e-08 8.76626e-08 3.63440e-08 19978
#> 3 2.02275e-10 1.46986e-08 8.76626e-08 4.40959e-08 19978
#> 4 1.73749e-08 8.82533e-07 5.26343e-06 3.78774e-06 19978
#> 5 2.02416e-08 8.82533e-07 5.26343e-06 4.41267e-06 19978
#> ... ... ... ... ... ...
#> 1379 1.42795e-05 0.000230108 0.00182283 0.0220904 29516
#> 1380 1.45430e-05 0.000231939 0.00183733 0.0224981 29516
#> 1381 1.63772e-05 0.000258526 0.00204794 0.0253356 29516
#> 1382 1.78933e-05 0.000279606 0.00221493 0.0276810 29516
#> 1383 1.81704e-05 0.000281097 0.00222674 0.0281097 29516
#> GenesInTerm GenesInQuery GenesInTermInQuery Source
#> <integer> <integer> <integer> <character>
#> 1 754 28 13
#> 2 427 28 10
#> 3 595 28 11
#> 4 504 28 9
#> 5 513 28 9
#> ... ... ... ... ...
#> 1379 157 28 4 AllianceGenome
#> 1380 50 28 3 DisGeNET Curated
#> 1381 52 28 3 GWAS
#> 1382 350 28 5 GWAS
#> 1383 7 28 2 AllianceGenome
#> URL GenesEntrez GenesSymbol
#> <character> <IntegerList> <CharacterList>
#> 1 10370,2626,23493,... CITED2,GATA4,HEY2,...
#> 2 10370,2626,23493,... CITED2,GATA4,HEY2,...
#> 3 10370,2626,23493,... CITED2,GATA4,HEY2,...
#> 4 2626,1482,9421,... GATA4,NKX2-5,HAND1,...
#> 5 2626,1482,9421,... GATA4,NKX2-5,HAND1,...
#> ... ... ... ...
#> 1379 https://fms.alliance.. 7040,3690,4524,... TGFB1,ITGB3,MTHFR,...
#> 1380 1482,4878,4879 NKX2-5,NPPA,NPPB
#> 1381 http://www.ebi.ac.uk.. 2626,3221,4524 GATA4,HOXC4,MTHFR
#> 1382 http://www.ebi.ac.uk.. 7040,1482,57057,... TGFB1,NKX2-5,TBX20,...
#> 1383 https://fms.alliance.. 4524,4846 MTHFR,NOS3
enrich_dr <- enrich(genes_dr$Entrez)
enrich_dr
#> DataFrame with 1353 rows and 15 columns
#> Category ID Name
#> <character> <character> <character>
#> 1 GeneOntologyMolecula.. GO:0042277 peptide binding
#> 2 GeneOntologyMolecula.. GO:0004888 transmembrane signal..
#> 3 GeneOntologyMolecula.. GO:0034617 tetrahydrobiopterin ..
#> 4 GeneOntologyMolecula.. GO:0030545 signaling receptor r..
#> 5 GeneOntologyMolecula.. GO:0042605 peptide antigen bind..
#> ... ... ... ...
#> 1349 Disease DOID:1070 (is_implic.. primary open angle g..
#> 1350 Disease DOID:224 (biomarker_.. transient cerebral i..
#> 1351 Disease C0036690 Septicemia
#> 1352 Disease C0243026 Sepsis
#> 1353 Disease C1719672 Severe Sepsis
#> PValue QValueFDRBH QValueFDRBY QValueBonferroni TotalGenes
#> <numeric> <numeric> <numeric> <numeric> <integer>
#> 1 1.07728e-07 2.97329e-05 0.000184327 2.97329e-05 19978
#> 2 8.08462e-06 9.69324e-04 0.006009253 2.23136e-03 19978
#> 3 1.05361e-05 9.69324e-04 0.006009253 2.90797e-03 19978
#> 4 2.11526e-05 1.45953e-03 0.009048234 5.83811e-03 19978
#> 5 3.43071e-05 1.89375e-03 0.011740162 9.46875e-03 19978
#> ... ... ... ... ... ...
#> 1349 4.85723e-09 1.42681e-07 1.21591e-06 1.36974e-05 29516
#> 1350 5.55304e-09 1.61045e-07 1.37241e-06 1.56596e-05 29516
#> 1351 5.82504e-09 1.61045e-07 1.37241e-06 1.64266e-05 29516
#> 1352 5.82504e-09 1.61045e-07 1.37241e-06 1.64266e-05 29516
#> 1353 5.82504e-09 1.61045e-07 1.37241e-06 1.64266e-05 29516
#> GenesInTerm GenesInQuery GenesInTermInQuery Source
#> <integer> <integer> <integer> <character>
#> 1 299 27 7
#> 2 1407 27 10
#> 3 4 27 2
#> 4 662 27 7
#> 5 47 27 3
#> ... ... ... ... ...
#> 1349 23 27 4 AllianceGenome
#> 1350 157 27 6 AllianceGenome
#> 1351 24 27 4 DisGeNET Curated
#> 1352 24 27 4 DisGeNET Curated
#> 1353 24 27 4 DisGeNET Curated
#> URL GenesEntrez GenesSymbol
#> <character> <IntegerList> <CharacterList>
#> 1 3077,348,3689,... HFE,APOE,ITGB2,...
#> 2 6401,7124,155,... SELE,TNF,ADRB3,...
#> 3 4843,4846 NOS2,NOS3
#> 4 4049,7124,348,... LTA,TNF,APOE,...
#> 5 3077,3115,3123 HFE,HLA-DPB1,HLA-DRB1
#> ... ... ... ...
#> 1349 https://fms.alliance.. 5444,7124,348,... PON1,TNF,APOE,...
#> 1350 https://fms.alliance.. 7124,348,4846,... TNF,APOE,NOS3,...
#> 1351 4049,7124,4843,... LTA,TNF,NOS2,...
#> 1352 4049,7124,4843,... LTA,TNF,NOS2,...
#> 1353 4049,7124,4843,... LTA,TNF,NOS2,...
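The three QValue columns are multiple-testing corrections of PValue. As a rough check, the sketch below recomputes them for the first five enrich_chd rows with base R's p.adjust(). The number of tests (218) is not stated anywhere in the output; it is inferred here from the ratio QValueBonferroni / PValue in those rows and should be treated as an assumption, as should the idea that ToppGene computes the corrections the same way as p.adjust().

```r
## First five PValue entries printed for enrich_chd above.
p <- c(6.31374e-12, 1.66715e-10, 2.02275e-10, 1.73749e-08, 2.02416e-08)

## Assumption: 218 terms were tested in this category, inferred from
## QValueBonferroni / PValue in the printed rows.
n_tests <- 218L

## Benjamini-Hochberg, Benjamini-Yekutieli and Bonferroni adjustments.
## Passing n is valid here because these are the five smallest p-values.
q <- data.frame(
    PValue = p,
    QValueFDRBH = p.adjust(p, method = "BH", n = n_tests),
    QValueFDRBY = p.adjust(p, method = "BY", n = n_tests),
    QValueBonferroni = p.adjust(p, method = "bonferroni", n = n_tests))
print(q, digits = 6)
```

Under these assumptions the recomputed values agree with the printed columns to the displayed precision, which suggests that the corrections are applied within each category and not across the whole result table.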
library(IRanges) # CharacterList
#> Loading required package: BiocGenerics
#> Loading required package: generics
#>
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
#> mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#> rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
#> unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> Loading required package: stats4
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> I, expand.grid, unname
library(DFplyr) # (DataFrame support for various dplyr functions)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:IRanges':
#>
#> collapse, desc, intersect, setdiff, slice, union
#> The following objects are masked from 'package:S4Vectors':
#>
#> first, intersect, rename, setdiff, setequal, union
#> The following objects are masked from 'package:BiocGenerics':
#>
#> combine, intersect, setdiff, setequal, union
#> The following object is masked from 'package:generics':
#>
#> explain
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#>
#> Attaching package: 'DFplyr'
#> The following object is masked from 'package:dplyr':
#>
#> desc
#> The following object is masked from 'package:IRanges':
#>
#> desc
## Show all DataFrame rows of top_results().
orig <- options(showHeadLines = 20L)
top_results <- function(df) {
    df |>
        group_by(Category) |>
        slice(1) |>
        ungroup() |>
        ## Shorten GeneOntology to GO.
        mutate(Category = gsub("GeneOntology", "GO", Category)) |>
        select(Category, ID, Name, GenesSymbol)
}
enrich_chd |>
    filter(any(GenesSymbol %in% CharacterList("HAND2"))) |>
    top_results()
#> DataFrame with 18 rows and 4 columns
#> Category ID Name
#> <character> <character> <character>
#> 1 Coexpression M1938 MEISSNER_BRAIN_HCP_W..
#> 2 CoexpressionAtlas PCBC_ratio_EB_vs_SC_.. ratio_EmbryoidBody_v..
#> 3 Cytoband 4q33 4q33
#> 4 Disease C0039685 Tetralogy of Fallot
#> 5 Domain 4.10.280.10 -
#> 6 Drug ctd:D003474 Curcumin
#> 7 GeneFamily 420 Basic helix-loop-hel..
#> 8 GOBiologicalProcess GO:0048738 cardiac muscle tissu..
#> 9 GOCellularComponent GO:0000785 chromatin
#> 10 GOMolecularFunction GO:0008134 transcription factor..
#> 11 HumanPheno HP:0025015 Abnormal vascular mo..
#> 12 Interaction int:NKX2-5 NKX2-5 interactions
#> 13 MicroRNA hsa-miR-105:PITA hsa-miR-105:PITA_TOP
#> 14 MousePheno MP:0005294 abnormal heart ventr..
#> 15 Pathway M48011 REACTOME_CARDIOGENESIS
#> 16 Pubmed 25336743 Arid3b is essential ..
#> 17 TFBS V$FREAC7_01 V$FREAC7_01
#> 18 ToppCell ba7f7ce034c0f42742bf.. facs-Heart-LV-3m-Mes..
#> GenesSymbol
#> <CharacterList>
#> 1 GATA4,NKX2-5,HAND1,...
#> 2 GATA4,HAND1,NPPB,...
#> 3 HAND2
#> 4 CITED2,GATA4,NKX2-5,...
#> 5 HEY2,HAND1,HAND2
#> 6 TGFB1,GATA4,CKM,...
#> 7 HEY2,HAND1,HAND2
#> 8 TGFB1,CITED2,GATA4,...
#> 9 CITED2,GATA4,HEY2,...
#> 10 CITED2,GATA4,HEY2,...
#> 11 CITED2,GATA4,HEY2,...
#> 12 GATA4,JARID2,NKX2-5,...
#> 13 CITED2,HEY2,JARID2,...
#> 14 CITED2,GATA4,HEY2,...
#> 15 GATA4,HEY2,NKX2-5,...
#> 16 GATA4,HEY2,GJA1,...
#> 17 GATA4,NPPB,HOXC4,...
#> 18 CKM,NKX2-5,SALL4,...
enrich_dr |>
    filter(any(GenesSymbol %in% CharacterList("HLA-DPB1"))) |>
    top_results()
#> DataFrame with 16 rows and 4 columns
#> Category ID Name
#> <character> <character> <character>
#> 1 Coexpression M10454 MCLACHLAN_DENTAL_CAR..
#> 2 CoexpressionAtlas geo_heart_1000_K1 geo_heart_top-relati..
#> 3 Computational GAVISH_3CA_METAPROGR.. Genes upregulated in..
#> 4 Cytoband 6p21.3 6p21.3
#> 5 Disease DOID:2841 (is_implic.. asthma (is_implicate..
#> 6 Domain IPR003006 Ig/MHC_CS
#> 7 GeneFamily 591 C1-set domain contai..
#> 8 GOBiologicalProcess GO:0002684 positive regulation ..
#> 9 GOCellularComponent GO:0098552 side of membrane
#> 10 GOMolecularFunction GO:0042277 peptide binding
#> 11 HumanPheno HP:0100721 Mediastinal lymphade..
#> 12 Interaction int:KNG1 KNG1 interactions
#> 13 MicroRNA hsa-miR-4443
#> 14 Pathway M16476 KEGG_CELL_ADHESION_M..
#> 15 Pubmed 20668555 Extended LTA, TNF, L..
#> 16 ToppCell 2ae62c428728c1d9d447.. Transplant_Alveoli_a..
#> GenesSymbol
#> <CharacterList>
#> 1 SELE,APOE,ITGB2,...
#> 2 ITGB2,HLA-DPB1,HLA-DRB1
#> 3 SELE,HLA-DPB1,HLA-DRB1,...
#> 4 HFE,LTA,TNF,...
#> 5 LTA,TNF,ACE,...
#> 6 HFE,HLA-DPB1,AGER,...
#> 7 HFE,HLA-DPB1,HLA-DRB1
#> 8 SELE,LTA,TNF,...
#> 9 SELE,HFE,TNF,...
#> 10 HFE,APOE,ITGB2,...
#> 11 APOE,HLA-DPB1,HLA-DRB1
#> 12 ITGB2,HLA-DPB1,AGT
#> 13 LTA,ITGA2B,HLA-DPB1,...
#> 14 SELE,ITGB2,HLA-DPB1,...
#> 15 LTA,TNF,HLA-DPB1,...
#> 16 SELE,ACE,HLA-DPB1,...
## Restore the previous showHeadLines setting.
options(orig)
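The filter() step in the two queries above keeps the rows whose gene list contains a given symbol. To make that row-wise membership test explicit, here is a minimal base R sketch in which an ordinary list column stands in for the CharacterList column. The three rows are copied from the enrich_chd output; the plain data.frame is an illustrative stand-in, not the object that enrich() returns.

```r
## Three rows copied from the enrich_chd output; GenesSymbol is a plain
## list column here, standing in for the CharacterList column.
res <- data.frame(
    Category = c("Cytoband", "Domain", "Disease"),
    ID = c("4q33", "4.10.280.10", "C1449563"))
res$GenesSymbol <- list(
    "HAND2",
    c("HEY2", "HAND1", "HAND2"),
    c("NKX2-5", "NPPA", "NPPB"))

## TRUE for each row whose gene list contains the symbol of interest.
has_hand2 <- vapply(res$GenesSymbol, function(g) "HAND2" %in% g, logical(1))
res[has_hand2, c("Category", "ID")]
```

The Cytoband and Domain rows are kept, and the Disease row, whose gene list lacks HAND2, is dropped.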
enrich_chd |>
lookup_pubchem()
#> DataFrame with 101 rows and 3 columns
#> Source ID CID
#> <character> <character> <character>
#> 1 CTD ctd:C007095 NA
#> 2 CTD ctd:C007350 15787
#> 3 CTD ctd:C026116 41781
#> 4 CTD ctd:C034587 56842157
#> 5 CTD ctd:C041125 NA
#> ... ... ... ...
#> 97 Stitch CID000070815 70815
#> 98 Stitch CID000168120 168120
#> 99 Stitch CID006450335 6450335
#> 100 Stitch CID005464096 5464096
#> 101 Stitch CID000023931 23931
enrich_dr |>
lookup_pubchem()
#> DataFrame with 100 rows and 3 columns
#> Source ID CID
#> <character> <character> <character>
#> 1 CTD ctd:C001803 137994
#> 2 CTD ctd:C003297 9677
#> 3 CTD ctd:C004479 NA
#> 4 CTD ctd:C007350 15787
#> 5 CTD ctd:C044946 NA
#> ... ... ... ...
#> 96 Stitch CID000001959 1959
#> 97 Stitch CID000003157 3157
#> 98 Stitch CID000071301 71301
#> 99 Stitch CID000000187 187
#> 100 Stitch CID000003715 3715
To limit or expand the number of results, adjust the cut-offs of a query
with CategoriesDataFrame().
## Default cut-offs.
cats <- CategoriesDataFrame()
cats
#> ToppGene CategoriesDataFrame with 19 categories
#> PValue MinGenes MaxGenes MaxResults Correction
#> Coexpression 0.05 1 1500 100 FDR
#> CoexpressionAtlas 0.05 1 1500 100 FDR
#> Computational 0.05 1 1500 100 FDR
#> Cytoband 0.05 1 1500 100 FDR
#> Disease 0.05 1 1500 100 FDR
#> Domain 0.05 1 1500 100 FDR
#> Drug 0.05 1 1500 100 FDR
#> GeneFamily 0.05 1 1500 100 FDR
#> GeneOntologyBiologicalProcess 0.05 1 1500 100 FDR
#> GeneOntologyCellularComponent 0.05 1 1500 100 FDR
#> GeneOntologyMolecularFunction 0.05 1 1500 100 FDR
#> HumanPheno 0.05 1 1500 100 FDR
#> Interaction 0.05 1 1500 100 FDR
#> MicroRNA 0.05 1 1500 100 FDR
#> MousePheno 0.05 1 1500 100 FDR
#> Pathway 0.05 1 1500 100 FDR
#> Pubmed 0.05 1 1500 100 FDR
#> TFBS 0.05 1 1500 100 FDR
#> ToppCell 0.05 1 1500 100 FDR
#> ------------------------------
#> Values allowed by ToppGene are:
#> PValue: [0, 1] <numeric>
#> MinGenes: [1, 5000] <integer>
#> MaxGenes: [2, 5000] <integer>
#> MaxResults: [1, 5000] <integer>
#> Correction: {None, FDR, Bonferroni} <character>
## Limit to 10 results for each category, and lower PValue for GeneOntology.
cats <-
    cats |>
    mutate(
        PValue = case_when(
            grepl("GeneOntology", rownames(cats)) ~ 1e-7,
            .default = PValue),
        MaxResults = 10L)
cats
#> ToppGene CategoriesDataFrame with 19 categories
#> PValue MinGenes MaxGenes MaxResults Correction
#> Coexpression 5e-02 1 1500 10 FDR
#> CoexpressionAtlas 5e-02 1 1500 10 FDR
#> Computational 5e-02 1 1500 10 FDR
#> Cytoband 5e-02 1 1500 10 FDR
#> Disease 5e-02 1 1500 10 FDR
#> Domain 5e-02 1 1500 10 FDR
#> Drug 5e-02 1 1500 10 FDR
#> GeneFamily 5e-02 1 1500 10 FDR
#> GeneOntologyBiologicalProcess 1e-07 1 1500 10 FDR
#> GeneOntologyCellularComponent 1e-07 1 1500 10 FDR
#> GeneOntologyMolecularFunction 1e-07 1 1500 10 FDR
#> HumanPheno 5e-02 1 1500 10 FDR
#> Interaction 5e-02 1 1500 10 FDR
#> MicroRNA 5e-02 1 1500 10 FDR
#> MousePheno 5e-02 1 1500 10 FDR
#> Pathway 5e-02 1 1500 10 FDR
#> Pubmed 5e-02 1 1500 10 FDR
#> TFBS 5e-02 1 1500 10 FDR
#> ToppCell 5e-02 1 1500 10 FDR
#> ------------------------------
#> Values allowed by ToppGene are:
#> PValue: [0, 1] <numeric>
#> MinGenes: [1, 5000] <integer>
#> MaxGenes: [2, 5000] <integer>
#> MaxResults: [1, 5000] <integer>
#> Correction: {None, FDR, Bonferroni} <character>
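The allowed ranges printed with the CategoriesDataFrame can also be checked before a query is sent. The helper below is hypothetical and not part of toppgene; it only encodes the ranges shown in the output above. Whether toppgene or the ToppGene server performs an equivalent check is not shown here.

```r
## Hypothetical helper (not part of toppgene): check one set of cut-offs
## against the ranges printed by CategoriesDataFrame().
check_cutoffs <- function(PValue, MinGenes, MaxGenes, MaxResults, Correction) {
    stopifnot(
        is.numeric(PValue), PValue >= 0, PValue <= 1,
        MinGenes >= 1, MinGenes <= 5000,
        MaxGenes >= 2, MaxGenes <= 5000,
        MaxResults >= 1, MaxResults <= 5000,
        Correction %in% c("None", "FDR", "Bonferroni"))
    invisible(TRUE)
}

## The modified Gene Ontology settings from above pass the check.
check_cutoffs(1e-7, 1L, 1500L, 10L, "FDR")

## A MaxResults of 0 falls outside [1, 5000] and raises an error,
## which is caught here and turned into a label.
result <- tryCatch(
    check_cutoffs(0.05, 1L, 1500L, 0L, "FDR"),
    error = function(e) "invalid")
result
```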
enrich_chd_mod <- enrich(genes_chd$Entrez, cats)
enrich_chd_mod
#> DataFrame with 165 rows and 15 columns
#> Category ID Name
#> <character> <character> <character>
#> 1 GeneOntologyMolecula.. GO:0008134 transcription factor..
#> 2 GeneOntologyMolecula.. GO:0061629 RNA polymerase II-sp..
#> 3 GeneOntologyMolecula.. GO:0140297 DNA-binding transcri..
#> 4 GeneOntologyBiologic.. GO:0048738 cardiac muscle tissu..
#> 5 GeneOntologyBiologic.. GO:0007507 heart development
#> ... ... ... ...
#> 161 Disease C0018800 Cardiomegaly
#> 162 Disease C1383860 Cardiac Hypertrophy
#> 163 Disease C0019284 Diaphragmatic Hernia
#> 164 Disease DOID:5844 (is_implic.. myocardial infarctio..
#> 165 Disease MONDO_0015263 Brugada syndrome
#> PValue QValueFDRBH QValueFDRBY QValueBonferroni TotalGenes GenesInTerm
#> <numeric> <numeric> <numeric> <numeric> <integer> <integer>
#> 1 6.31374e-12 1.37639e-09 8.20882e-09 1.37639e-09 19978 754
#> 2 1.66715e-10 1.46986e-08 8.76626e-08 3.63440e-08 19978 427
#> 3 2.02275e-10 1.46986e-08 8.76626e-08 4.40959e-08 19978 595
#> 4 2.83683e-22 6.60698e-19 5.50403e-18 6.60698e-19 20557 326
#> 5 2.75671e-21 3.21019e-18 2.67429e-17 6.42039e-18 20557 764
#> ... ... ... ... ... ... ...
#> 161 1.11016e-12 2.45344e-10 1.94352e-09 1.71741e-09 29516 82
#> 162 1.11016e-12 2.45344e-10 1.94352e-09 1.71741e-09 29516 82
#> 163 1.80466e-12 3.48976e-10 2.76445e-09 2.79181e-09 29516 41
#> 164 4.30131e-12 7.39348e-10 5.85683e-09 6.65413e-09 29516 99
#> 165 6.06821e-12 9.38752e-10 7.43643e-09 9.38752e-09 29516 19
#> GenesInQuery GenesInTermInQuery Source URL
#> <integer> <integer> <character> <character>
#> 1 28 13
#> 2 28 10
#> 3 28 11
#> 4 28 16
#> 5 28 19
#> ... ... ... ... ...
#> 161 28 7 DisGeNET Curated
#> 162 28 7 DisGeNET Curated
#> 163 28 6 DisGeNET Curated
#> 164 28 7 AllianceGenome https://fms.alliance..
#> 165 28 5 GWAS http://purl.obolibra..
#> GenesEntrez GenesSymbol
#> <IntegerList> <CharacterList>
#> 1 10370,2626,23493,... CITED2,GATA4,HEY2,...
#> 2 10370,2626,23493,... CITED2,GATA4,HEY2,...
#> 3 10370,2626,23493,... CITED2,GATA4,HEY2,...
#> 4 7040,10370,2626,... TGFB1,CITED2,GATA4,...
#> 5 7040,10370,2626,... TGFB1,CITED2,GATA4,...
#> ... ... ...
#> 161 2626,1482,4878,... GATA4,NKX2-5,NPPA,...
#> 162 2626,1482,4878,... GATA4,NKX2-5,NPPA,...
#> 163 7040,2626,2697,... TGFB1,GATA4,GJA1,...
#> 164 7040,4878,4879,... TGFB1,NPPA,NPPB,...
#> 165 2626,23493,57057,... GATA4,HEY2,TBX20,...
## MaxResults limited to at most 10.
enrich_chd_mod |>
count(Category)
#> DataFrame with 18 rows and 2 columns
#> Category n
#> <factor> <integer>
#> 1 Coexpression 10
#> 2 CoexpressionAtlas 10
#> 3 Computational 2
#> 4 Cytoband 10
#> 5 Disease 10
#> ... ... ...
#> 14 MousePheno 10
#> 15 Pathway 10
#> 16 Pubmed 10
#> 17 TFBS 10
#> 18 ToppCell 10
## GeneOntology PValue limited to below 1e-7:
## show the largest PValue remaining in each GeneOntology category.
enrich_chd_mod |>
    arrange(desc(PValue)) |>
    filter(grepl("Onto", Category)) |>
    group_by(Category) |>
    slice(1)
#> DataFrame with 2 rows and 15 columns
#> Category ID Name PValue
#> <character> <character> <character> <numeric>
#> 1 GeneOntologyBiologic.. GO:0003231 cardiac ventricle de.. 1.03508e-18
#> 2 GeneOntologyMolecula.. GO:0140297 DNA-binding transcri.. 2.02275e-10
#> QValueFDRBH QValueFDRBY QValueBonferroni TotalGenes GenesInTerm GenesInQuery
#> <numeric> <numeric> <numeric> <integer> <integer> <integer>
#> 1 2.36401e-16 1.96936e-15 2.41071e-15 20557 162 28
#> 2 1.46986e-08 8.76626e-08 4.40959e-08 19978 595 28
#> GenesInTermInQuery Source URL GenesEntrez
#> <integer> <character> <character> <IntegerList>
#> 1 12 7040,10370,2626,...
#> 2 11 10370,2626,23493,...
#> GenesSymbol
#> <CharacterList>
#> 1 TGFB1,CITED2,GATA4,...
#> 2 CITED2,GATA4,HEY2,...
sessionInfo()
#> R Under development (unstable) (2026-01-15 r89304)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] DFplyr_1.5.2 dplyr_1.2.0 IRanges_2.45.0
#> [4] S4Vectors_0.49.0 BiocGenerics_0.57.0 generics_0.1.4
#> [7] toppgene_0.99.1 BiocStyle_2.39.0
#>
#> loaded via a namespace (and not attached):
#> [1] rappdirs_0.3.4 sass_0.4.10 xml2_1.5.2
#> [4] RSQLite_2.4.6 hms_1.1.4 digest_0.6.39
#> [7] magrittr_2.0.4 evaluate_1.0.5 bookdown_0.46
#> [10] fastmap_1.2.0 blob_1.3.0 jsonlite_2.0.0
#> [13] DBI_1.3.0 BiocManager_1.30.27 purrr_1.2.1
#> [16] httr2_1.2.2 jquerylib_0.1.4 cli_3.6.5
#> [19] crayon_1.5.3 rlang_1.1.7 dbplyr_2.5.2
#> [22] bit64_4.6.0-1 withr_3.0.2 cachem_1.1.0
#> [25] yaml_2.3.12 otel_0.2.0 parallel_4.6.0
#> [28] tools_4.6.0 tzdb_0.5.0 memoise_2.0.1
#> [31] filelock_1.0.3 curl_7.0.0 vctrs_0.7.1
#> [34] R6_2.6.1 BiocFileCache_3.1.0 lifecycle_1.0.5
#> [37] bit_4.6.0 vroom_1.7.0 pkgconfig_2.0.3
#> [40] pillar_1.11.1 bslib_0.10.0 glue_1.8.0
#> [43] xfun_0.56 tibble_3.3.1 tidyselect_1.2.1
#> [46] knitr_1.51 htmltools_0.5.9 rmarkdown_2.30
#> [49] readr_2.2.0 compiler_4.6.0
Chen, Jing, Huan Xu, Bruce J. Aronow, and Anil G. Jegga. 2007. “Improved Human Disease Candidate Gene Prioritization Using Mouse Phenotype.” BMC Bioinformatics 8 (1): 392. https://doi.org/10.1186/1471-2105-8-392.