countPatientsInNet {netDx}    R Documentation
Count number of patients in a network
countPatientsInNet(netDir, fList, ids)
netDir: (char) directory containing the network set
fList: (char) filenames of the interaction networks to count patients in
ids: (char) patient IDs to look for
This function counts patient overlap when the input data are sparse, that is, when most measures are missing for most patients rather than available for almost all of them. An example application is when patient networks are based on genomic events unique to each patient (e.g. CNVs or indels), rather than on 'full-matrix' data (e.g. questionnaires or gene expression matrices). In the former scenario, the list of eligible networks must be updated each time some type of patient subsetting is applied (e.g. label enrichment or a train/test split). A matrix of patient/network membership serves as a lookup table to prune networks as feature selection proceeds.
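As a minimal sketch of the lookup-table idea, the base R snippet below prunes networks after a train/test split. The membership matrix, the patient and network names, and the `min_patients` threshold are all hypothetical values chosen for illustration; they are not taken from netDx or from a real dataset.

```r
# Hypothetical membership matrix: rows are patients, columns are networks.
# A 1 means the patient appears in that network, 0 means they do not.
membership <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    0, 1, 0,
    0, 1, 1,
    0, 0, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("P", 1:5), c("netA", "netB", "netC"))
)

# Suppose a train/test split keeps only these patients for training.
train_ids <- c("P1", "P2", "P3")

# Count how many training patients remain in each network.
n_in_net <- colSums(membership[train_ids, , drop = FALSE])

# Keep networks with at least `min_patients` training patients
# (the threshold is an assumption made for this example).
min_patients <- 2
eligible <- names(n_in_net)[n_in_net >= min_patients]
print(eligible)
```

Here netC is dropped because none of the training patients belong to it, so it could not contribute to feature selection on this subset.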
(matrix) Size P by N, where P is the number of patients and N is the number of networks; a[i,j] = 1 if patient i is in network j, else 0.
# Write two small interaction networks to a temporary directory
d <- tempdir()
pids <- paste("P",1:5,sep="")
m1 <- matrix(c("P1","P1","P2","P2","P3","P4",1,1,1),
	byrow=FALSE,ncol=3)
write.table(m1,file=paste(d,"net1.txt",sep=getFileSep()),sep="\t",
	col.names=FALSE,row.names=FALSE,quote=FALSE)
m2 <- matrix(c("P3","P4",1),nrow=1)
write.table(m2,file=paste(d,"net2.txt",sep=getFileSep()),sep="\t",
	col.names=FALSE,row.names=FALSE,quote=FALSE)
# Record which of the five patients appear in each network
x <- countPatientsInNet(d,c("net1.txt","net2.txt"), pids)
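To make the shape of the result concrete, here is a base R sketch of the kind of membership matrix described above. It is not netDx's implementation: `count_patients_sketch` is a hypothetical helper, and it assumes each network file is tab-delimited, has no header, and lists patient IDs in its first two columns, as the files in the example do.

```r
# Sketch only, not the netDx implementation. Assumes tab-delimited files
# with no header and patient IDs in the first two columns.
count_patients_sketch <- function(net_dir, f_list, ids) {
  # Start with an all-zero P-by-N matrix, named by patient and file.
  out <- matrix(0L, nrow = length(ids), ncol = length(f_list),
                dimnames = list(ids, f_list))
  for (f in f_list) {
    net <- read.delim(file.path(net_dir, f), header = FALSE,
                      stringsAsFactors = FALSE)
    # A patient is "in" the network if they appear at either end of an edge.
    members <- unique(c(net[[1]], net[[2]]))
    out[ids %in% members, f] <- 1L
  }
  out
}

# Recreate the two small networks from the example in a fresh directory.
d <- tempfile("nets")
dir.create(d)
writeLines(c("P1\tP2\t1", "P1\tP3\t1", "P2\tP4\t1"), file.path(d, "net1.txt"))
writeLines("P3\tP4\t1", file.path(d, "net2.txt"))

m <- count_patients_sketch(d, c("net1.txt", "net2.txt"), paste0("P", 1:5))
print(m)
```

Under these assumptions, the row for P5 is all zeros because P5 appears in neither file, and `colSums(m)` gives the number of patients found in each network.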