RunCRE_HSAStringDB {QuaternaryProd} | R Documentation |
This function runs a causal relation engine by computing the Quaternary Dot Product Scoring Statistic, the Ternary Dot Product Scoring Statistic, or the Enrichment test over the Homo sapiens STRINGdb causal network (version 10, provided under the Creative Commons license: https://creativecommons.org/licenses/by/3.0/).
RunCRE_HSAStringDB(gene_expression_data, method = "Quaternary", fc.thresh = log2(1.3), pval.thresh = 0.05, only.significant.pvalues = FALSE, significance.level = 0.05, epsilon = 1e-16)
gene_expression_data |
A data frame for gene expression data. The |
method |
Choose one of |
fc.thresh |
Threshold for fold change in |
pval.thresh |
Threshold for p-values in |
only.significant.pvalues |
If |
significance.level |
When |
epsilon |
Threshold for probabilities of matrices. Default value is 1e-16. |
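As a rough illustration of what thresholds such as fc.thresh and pval.thresh typically do, the sketch below labels each gene in a toy data frame as up-regulated (+1), down-regulated (-1) or not significant (0). The column names follow the example at the end of this page. The toy values, the helper name classify_genes, and the exact comparison rules (strict inequalities, fc on the log2 scale) are assumptions and may differ from what RunCRE_HSAStringDB does internally.

```r
# Hypothetical helper (not part of QuaternaryProd): label genes by direction.
# Assumes fc holds log2 fold changes and pvalue holds raw p-values.
classify_genes <- function(df, fc.thresh = log2(1.3), pval.thresh = 0.05) {
  # A gene counts as significant only if it passes both thresholds.
  significant <- df$pvalue < pval.thresh & abs(df$fc) > fc.thresh
  # Significant genes get the sign of their fold change, all others get 0.
  ifelse(significant, sign(df$fc), 0)
}

# Made-up data, for illustration only.
toy <- data.frame(
  entrez = c("1", "2", "3", "4"),
  pvalue = c(0.01, 0.20, 0.03, 0.04),
  fc     = c(1.2, 2.0, -0.9, 0.1),
  stringsAsFactors = FALSE
)

toy$direction <- classify_genes(toy)
print(toy)
```

Under these assumptions, gene 2 is dropped because of its p-value and gene 4 because its fold change is below log2(1.3), even though its p-value passes.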
This function returns a data frame containing parameters concerning the method used. A p-value is also computed for each regulator, and the data frame is sorted in increasing order of the p-value of the goodness-of-fit score for the given regulators. The column names of the data frame are:
uid
The regulator in the STRINGdb network.
symbol
Symbol of the regulator.
regulation
Direction of regulation of the regulator.
correct.pred
Number of correct predictions in gene_expression_data when compared to predictions made by the network.
incorrect.pred
Number of incorrect predictions in gene_expression_data when compared to predictions made by the network.
score
The number of correct predictions minus the number of incorrect predictions.
total.reachable
Total number of children of the given regulator.
significant.reachable
Number of children of the given regulator that are also present in gene_expression_data.
total.ambiguous
Total number of children of the given regulator which are regulated by the given regulator without knowing the direction of regulation.
significant.ambiguous
Total number of children of the given regulator which are regulated by the given regulator without knowing the direction of regulation and are also present in gene_expression_data.
unknown
Number of target nodes in the STRINGdb causal network which do not interact with the given regulator.
pvalue
P-value of the score computed according to the selected method. If only.significant.pvalues = TRUE and the pvalue of the regulator is greater than significance.level, then the p-value is not computed and is set to a value of -1.
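To make the relationship between correct.pred, incorrect.pred and score concrete, here is a minimal sketch in base R. The vectors predicted and observed are made-up toy data (+1 up, -1 down, 0 no significant change), and treating zero entries as neither correct nor incorrect is an assumption of this sketch, not a statement about the package internals.

```r
# Hypothetical network predictions for five targets of one regulator.
predicted <- c(1, -1, 1, 1, -1)
# Hypothetical observed directions for the same five targets.
observed  <- c(1, -1, -1, 0, -1)

# Only targets with a direction on both sides are compared (assumption).
informative    <- predicted != 0 & observed != 0
correct.pred   <- sum(informative & predicted == observed)
incorrect.pred <- sum(informative & predicted != observed)

# The score is the number of correct minus incorrect predictions.
score <- correct.pred - incorrect.pred
print(c(correct = correct.pred, incorrect = incorrect.pred, score = score))
```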
Carl Tony Fakhry, Ping Chen and Kourosh Zarringhalam
Carl Tony Fakhry, Parul Choudhary, Alex Gutteridge, Ben Sidders, Ping Chen, Daniel Ziemek, and Kourosh Zarringhalam. Interpreting transcriptional changes using causal graphs: new methods and their practical utility on public networks. BMC Bioinformatics, 17:318, 2016. ISSN 1471-2105. doi: 10.1186/s12859-016-1181-8.
Franceschini, A. (2013). STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res., 41(Database issue):D808-15, 2013. doi: 10.1093/nar/gks1094. Epub 2012 Nov 29.
# Get gene expression data
e2f3 <- system.file("extdata", "e2f3_sig.txt", package = "QuaternaryProd")
e2f3 <- read.table(e2f3, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Rename column names appropriately and remove duplicated entrez ids
names(e2f3) <- c("entrez", "pvalue", "fc")
e2f3 <- e2f3[!duplicated(e2f3$entrez),]

# Compute the Quaternary Dot Product Scoring statistic for statistically significant
# regulators in the STRINGdb network
quaternary_results <- RunCRE_HSAStringDB(e2f3, method = "Quaternary",
                          fc.thresh = log2(1.3), pval.thresh = 0.05,
                          only.significant.pvalues = TRUE)

# Get FDR corrected p-values
quaternary_results["qvalue"] <- p.adjust(quaternary_results$pvalue, method = "fdr")
quaternary_results[1:4, c("uid","symbol","regulation","pvalue","qvalue")]
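Because the pvalue column can hold the placeholder -1 when only.significant.pvalues = TRUE (see the description of the pvalue column), it may be worth excluding those rows before calling p.adjust. The sketch below shows one way to do that on a made-up data frame standing in for the function's output. The data and the choice to leave the q-value as NA for placeholder rows are assumptions, not the authors' documented workflow.

```r
# Made-up stand-in for the output of RunCRE_HSAStringDB (illustration only).
results <- data.frame(
  uid    = c("A", "B", "C", "D"),
  pvalue = c(0.001, 0.02, -1, 0.04),
  stringsAsFactors = FALSE
)

# Rows with pvalue == -1 were not computed, so keep them out of the correction.
computed <- results$pvalue >= 0

# Start with NA everywhere, then fill in FDR-adjusted values for computed rows.
results$qvalue <- NA_real_
results$qvalue[computed] <- p.adjust(results$pvalue[computed], method = "fdr")
print(results)
```

Note that dropping rows changes the number of tests seen by p.adjust; if the correction should account for every regulator tested, the n argument of p.adjust can be set explicitly.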