addMotifAnnotation {RcisTarget} | R Documentation |
Select significant motifs and/or annotate motifs to transcription factors. Motifs are considered significantly enriched if they pass the Normalized Enrichment Score (NES) threshold.
addMotifAnnotation(auc, nesThreshold = 3, digits = 3, motifAnnot = NULL, motifAnnot_highConfCat = c("directAnnotation", "inferredBy_Orthology"), motifAnnot_lowConfCat = c("inferredBy_MotifSimilarity", "inferredBy_MotifSimilarity_n_Orthology"), highlightTFs = NULL)
auc |
Output from calcAUC. |
nesThreshold |
Numeric. NES threshold used to call a motif significant (3.0 by default). The NES is calculated for each motif from the AUC distribution of all the motifs for the gene set: (x - mean)/sd. |
digits |
Integer. Number of digits for the AUC and NES in the output table. |
motifAnnot |
Motif annotation database containing the annotations of the motif to transcription factors. The names should match the ranking column names. |
motifAnnot_highConfCat |
Categories considered as source for 'high confidence' annotations. By default, "directAnnotation" (annotated in the source database) and "inferredBy_Orthology" (the motif is annotated to a homologous/orthologous gene). |
motifAnnot_lowConfCat |
Categories considered as source for 'lower confidence' annotations. By default, the annotations inferred based on motif similarity ("inferredBy_MotifSimilarity", "inferredBy_MotifSimilarity_n_Orthology"). |
highlightTFs |
Character. If a list of transcription factors is provided, the column TFinDB in the output table will indicate whether any of those TFs are included within the 'high-confidence' annotation (two asterisks, **) or 'low-confidence' annotation (one asterisk, *) of the motif. The vector can be named to indicate which TF to highlight for each gene set. Otherwise, all TFs will be used for all gene sets. |
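As a rough illustration of the NES definition given for nesThreshold, the sketch below applies (x - mean)/sd to a made-up AUC vector in base R. The motif names, the AUC values and the lowered threshold are assumptions for illustration only; the use of sd() (sample standard deviation) and of >= for the comparison are likewise assumptions, not confirmed details of the package internals.

```r
# Toy AUC values for five motifs in one gene set
# (hypothetical names and numbers, not real RcisTarget output)
auc <- c(motifA = 0.020, motifB = 0.030, motifC = 0.025,
         motifD = 0.100, motifE = 0.028)

# NES for each motif: (x - mean) / sd over the AUC distribution of all motifs
# (assumption: sd() is the sample standard deviation)
nes <- (auc - mean(auc)) / sd(auc)

# The function default is nesThreshold = 3; it is lowered here because a
# z-score computed over only five values cannot reach 3
nesThreshold <- 1.5

# Motifs passing the threshold (assumption: the comparison is non-strict)
significant <- names(nes)[nes >= nesThreshold]
print(round(nes, digits = 3))
print(significant)
```

With a full motif collection the AUC distribution contains thousands of values, so a default threshold of 3 selects only motifs whose AUC lies well above the bulk of the distribution.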
data.table with the following columns:
geneSet: Name of the gene set
motif: ID of the motif (column names of the ranking; it may be another kind of feature)
NES: Normalized enrichment score of the motif in the gene-set
AUC: Area Under the Curve (used to calculate the NES)
TFinDB: Indicates whether the TFs given in highlightTFs are included within the high-confidence annotation (two asterisks, **) or lower-confidence annotation (one asterisk, *)
TF_highConf: Transcription factors annotated to the motif based on high-confidence annotations.
TF_lowConf: Transcription factors annotated to the motif based on lower-confidence annotations.
Next step in the workflow: addSignificantGenes.
Previous step in the workflow: calcAUC.
See the package vignette for examples and more details:
vignette("RcisTarget")
##################################################
# Setup & previous steps in the workflow:

#### Gene sets
# As an example, the package includes a hypoxia gene set:
txtFile <- paste(file.path(system.file('examples', package='RcisTarget')),
                 "hypoxiaGeneSet.txt", sep="/")
geneLists <- list(hypoxia=read.table(txtFile, stringsAsFactors=FALSE)[,1])

#### Databases
## Motif rankings: Select according to organism and distance around TSS
## (See the vignette for URLs to download)
# motifRankings <- importRankings("hg19-500bp-upstream-7species.mc9nr.feather")

## For this example we will use a SUBSET of the ranking/motif databases:
library(RcisTarget.hg19.motifDBs.cisbpOnly.500bp)
data(hg19_500bpUpstream_motifRanking_cispbOnly)
motifRankings <- hg19_500bpUpstream_motifRanking_cispbOnly

## Motif - TF annotation:
data(motifAnnotations_hgnc) # human TFs (for motif collection 9)
motifAnnotation <- motifAnnotations_hgnc

### Run RcisTarget
# Step 1. Calculate AUC
motifs_AUC <- calcAUC(geneLists, motifRankings)

##################################################
### (This step: Step 2)
# Select significant motifs, add TF annotation & format as table
motifEnrichmentTable <- addMotifAnnotation(motifs_AUC,
                                           motifAnnot=motifAnnotation)

# Alternative: Modifying some options
motifEnrichment_wIndirect <- addMotifAnnotation(motifs_AUC, nesThreshold=2,
    motifAnnot=motifAnnotation,
    highlightTFs = "HIF1A",
    motifAnnot_highConfCat=c("directAnnotation"),
    motifAnnot_lowConfCat=c("inferredBy_MotifSimilarity",
                            "inferredBy_MotifSimilarity_n_Orthology",
                            "inferredBy_Orthology"),
    digits=3)

### Exploring the output:
# Number of enriched motifs (over the given NES threshold)
nrow(motifEnrichmentTable)

# Interactive exploration
motifEnrichmentTable <- addLogo(motifEnrichmentTable)
DT::datatable(motifEnrichmentTable, filter="top", escape=FALSE,
              options=list(pageLength=50))
# Note: If using the fake database, the results of this analysis are meaningless

# The object returned is a data.table (for faster computation),
# which has a different syntax from the standard data.frame or matrix
# Feel free to convert it to a data.frame (as.data.frame())
motifEnrichmentTable[,1:6]

##################################################
# Next step (step 3, optional):
## Not run:
motifEnrichmentTable_wGenes <- addSignificantGenes(motifEnrichmentTable,
                   geneSets=geneLists,
                   rankings=motifRankings,
                   method="aprox")

## End(Not run)