logicFS {logicFS} | R Documentation |
Identification of interesting interactions between binary variables
using logic regression. Currently available for the classification, the linear
regression and the logistic regression approach of logreg, and for
a multinomial logic regression as implemented in mlogreg.
## Default S3 method:
logicFS(x, y, B = 100, useN = TRUE, ntrees = 1, nleaves = 8,
    glm.if.1tree = FALSE, replace = TRUE, sub.frac = 0.632,
    anneal.control = logreg.anneal.control(), onlyRemove = FALSE,
    prob.case = 0.5, addMatImp = TRUE, fast = FALSE, rand = NULL, ...)

## S3 method for class 'formula'
logicFS(formula, data, recdom = TRUE, ...)
x |
a matrix consisting of 0's and 1's. Each column must correspond to a binary variable and each row to an observation. Missing values are not allowed. |
y |
a numeric vector or a factor specifying the values of a response for all the observations
represented in |
B |
an integer specifying the number of iterations. |
useN |
logical specifying if the number of correctly classified out-of-bag observations should
be used in the computation of the importance measure. If |
ntrees |
an integer indicating how many trees should be used.
For a binary response: If
For a continuous response: A linear regression model with
For a categorical response: n.lev-1 logic regression models with |
nleaves |
a numeric value specifying the maximum number of leaves used
in all trees combined. For details, see the help page of the function |
glm.if.1tree |
if |
replace |
should sampling of the cases be done with replacement? If
|
sub.frac |
a proportion specifying the fraction of the observations that
are used in each iteration to build a classification rule if |
anneal.control |
a list containing the parameters for simulated annealing.
See the help of the function |
onlyRemove |
should the multiple tree measure be used in the single tree case? If |
prob.case |
a numeric value between 0 and 1. If the outcome of the
logistic regression, i.e. the predicted probability, for an observation is
larger than |
addMatImp |
should the matrix containing the improvements due to the prime implicants
in each of the iterations be added to the output? (For each of the prime implicants,
the importance is computed by the average over the |
fast |
should a greedy search (as implemented in |
rand |
a numeric value. If specified, the random number generator will be set to a reproducible state. |
formula |
an object of class |
data |
a data frame containing the variables in the model. Each row of |
recdom |
a logical value or vector of length |
... |
for the |
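The requirements on x and the roles of replace and sub.frac described above can be illustrated with a minimal base R sketch. It is not the internal code of logicFS: the helper names check_binary_matrix and draw_inbag are hypothetical, and the rule that ceiling(sub.frac * n) observations are drawn when replace = FALSE is an assumption made for illustration.

```r
# Hypothetical helper: TRUE if x is a matrix of 0's and 1's
# without missing values, as required for the argument x.
check_binary_matrix <- function(x) {
  is.matrix(x) && !anyNA(x) && all(x %in% c(0, 1))
}

# Hypothetical helper: draw the in-bag observations for one iteration.
# Assumption: replace = TRUE draws a bootstrap sample of size n;
# replace = FALSE draws ceiling(sub.frac * n) observations without replacement.
draw_inbag <- function(n, replace = TRUE, sub.frac = 0.632) {
  if (replace) {
    sample(n, n, replace = TRUE)
  } else {
    sample(n, ceiling(sub.frac * n), replace = FALSE)
  }
}

set.seed(123)

# Simulated binary data: 20 observations (rows), 10 variables (columns).
x <- matrix(rbinom(200, 1, 0.5), nrow = 20, ncol = 10)
stopifnot(check_binary_matrix(x))

# One iteration without replacement: in-bag and out-of-bag observations.
inbag <- draw_inbag(nrow(x), replace = FALSE)
oob <- setdiff(seq_len(nrow(x)), inbag)
```

In this sketch, the observations in oob are the ones left out of model building in the iteration; they play the role of the out-of-bag observations mentioned in the description of useN.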
An object of class logicFS containing
primes |
the prime implicants, |
vim |
the importance of the prime implicants, |
prop |
the proportion of logic regression models that contain the prime implicants, |
type |
the type of model (1: classification, 2: linear regression, 3: logistic regression), |
param |
further parameters (if |
mat.imp |
the matrix containing the improvements if |
measure |
the name of the used importance measure, |
useN |
the value of |
threshold |
NULL, |
mu |
NULL. |
Holger Schwender, holger.schwender@udo.edu
Ruczinski, I., Kooperberg, C., LeBlanc, M.L. (2003). Logic Regression. Journal of Computational and Graphical Statistics, 12, 475-511.
Schwender, H., Ickstadt, K. (2007). Identification of SNP Interactions Using Logic Regression. Biostatistics, 9(1), 187-198.
## Not run:
# Load data.
data(data.logicfs)

# For logic regression and hence logicFS, the variables must
# be binary. data.logicfs, however, contains categorical data
# with realizations 1, 2 and 3. Such data can be transformed
# into binary data by
bin.snps <- make.snp.dummy(data.logicfs)

# To speed up the search for the best logic regression models,
# only a small number of iterations is used in simulated annealing.
my.anneal <- logreg.anneal.control(start = 2, end = -2, iter = 10000)

# Feature selection using logic regression is then done by
log.out <- logicFS(bin.snps, cl.logicfs, B = 20, nleaves = 10,
    rand = 123, anneal.control = my.anneal)

# The output of logicFS can be printed
log.out

# One can specify another number of interactions that should be
# printed, here, e.g., 15.
print(log.out, topX = 15)

# The variable importance can also be plotted.
plot(log.out)

# And the original variable names are displayed in
plot(log.out, coded = FALSE)
## End(Not run)