lr_modeling: Logistic Regression Predictive Models

View source: R/lr_modeling.R

Description

Given a set of candidate features, the function selects the most informative ones and evaluates the resulting predictive logistic regression model. Feature selection and model fitting are performed independently within each round of LOOCV, so the held-out sample never influences the model that predicts it. Feature selection uses the LASSO approach.
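The per-round procedure described above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the source of lr_modeling(): it assumes the glmnet package for the LASSO step and assumes a plain glm() refit on the selected features, and all object names (x, y, prob, selected) are hypothetical.

```r
# Sketch only: simulated data and hypothetical names, not the vp.misc code.
library(glmnet)

set.seed(1)
n <- 40; p <- 10
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("f", 1:p)))
y <- factor(ifelse(x[, 1] - x[, 2] + rnorm(n) > 0, "case", "control"))
dat <- data.frame(y = as.integer(y == "case"), x)

prob <- rep(NA_real_, n)          # held-out probability of "case"
selected <- vector("list", n)     # features chosen in each round

for (i in seq_len(n)) {
  train <- setdiff(seq_len(n), i)
  # LASSO (alpha = 1) fitted on the training samples only
  cvfit <- cv.glmnet(x[train, ], y[train], family = "binomial", alpha = 1)
  cf <- as.matrix(coef(cvfit, s = "lambda.min"))
  keep <- setdiff(rownames(cf)[cf[, 1] != 0], "(Intercept)")
  selected[[i]] <- keep
  # Assumed second step: ordinary logistic regression on the kept features
  form <- if (length(keep) > 0) reformulate(keep, response = "y") else y ~ 1
  fit <- suppressWarnings(glm(form, data = dat[train, ], family = binomial()))
  prob[i] <- predict(fit, newdata = dat[i, , drop = FALSE], type = "response")
}
```

Because sample i is excluded before both the LASSO step and the refit, prob[i] is an out-of-sample prediction, which is the point of repeating feature selection inside every round.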

Usage

lr_modeling(
  msnset,
  features,
  response,
  pred.cls,
  K = NULL,
  sel.feat = T,
  par.backend = c("mc", "foreach", "none")
)

Arguments

msnset

MSnSet object. Note: whether this can be generalized to eset is an open question.

features

character vector of features to select from when building the prediction model. The features can be either in featureNames(msnset) or in pData(msnset).

response

factor to classify along. Must have exactly 2 levels.

pred.cls

character, class to predict

K

specifies the cross-validation type. Default NULL means LOOCV. Another typical value is 10.

sel.feat

logical; if TRUE, select features using LASSO; if FALSE, use the entire set.

par.backend

type of backend used for parallelization. 'mc' uses mclapply from parallel, 'foreach' is based on 'foreach', and 'none' runs in a single thread.
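To make the K argument concrete, the sketch below shows one way samples could be assigned to folds, with NULL falling back to one sample per fold (LOOCV). The helper make_folds() is hypothetical and is not part of vp.misc; the package may partition samples differently.

```r
# Hypothetical helper, not part of vp.misc: assign n samples to K folds.
make_folds <- function(n, K = NULL) {
  if (is.null(K)) K <- n                     # NULL: one sample per fold (LOOCV)
  sample(rep(seq_len(K), length.out = n))    # shuffled fold labels
}

set.seed(1)
loocv_folds <- make_folds(25)          # 25 folds of size 1
tenfold     <- make_folds(25, K = 10)  # 10 folds of size 2 or 3
table(tenfold)
```

With K = 10 each model is trained on roughly 90% of the samples, which needs far fewer model fits than LOOCV on larger data sets.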

Value

list

prob

the probabilities (response) from LOOCV that each sample is "case". That is, how well a model trained on the other samples predicts this particular one.

features

list of selected features for each iteration of LOOCV

top

top features over all iterations

auc

AUC

pred

prediction performance obtained by ROCR::prediction
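As a rough illustration of how the returned elements relate to each other, the sketch below tallies recurring features and computes an AUC from held-out probabilities using base R only. The input values are made up, the rank-based AUC helper rank_auc() is hypothetical, and the tally is only an assumed way of deriving something like top; the function itself relies on ROCR.

```r
# Made-up held-out results for six samples (illustrative values only).
prob  <- c(0.91, 0.15, 0.78, 0.40, 0.66, 0.22)
truth <- c("case", "control", "case", "control", "control", "case")
features <- list(c("f1", "f2"), "f1", c("f1", "f3"),
                 c("f1", "f2"), "f2", "f1")

# How often each feature was selected across rounds (assumed analogue of `top`)
top <- sort(table(unlist(features)), decreasing = TRUE)

# Hypothetical helper: AUC via the rank-sum (Mann-Whitney) identity
rank_auc <- function(prob, is_case) {
  r  <- rank(prob)
  n1 <- sum(is_case)
  n0 <- sum(!is_case)
  (sum(r[is_case]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
auc <- rank_auc(prob, truth == "case")
```

With ROCR installed, a comparable value should be obtainable from ROCR::performance(ROCR::prediction(prob, truth, label.ordering = c("control", "case")), "auc"), where label.ordering makes "case" the positive class.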

Examples

library(vp.misc)
data(srm_msnset)
head(varLabels(msnset))
head(msnset$subject.type)
# reduce to two classes
msnset <- msnset[,msnset$subject.type != "control.1"]
msnset$subject.type <- as.factor(msnset$subject.type)
# Note, par.backend="none" is for the example only.
out <- lr_modeling(msnset, 
                   features=featureNames(msnset), 
                   response="subject.type", 
                   pred.cls="case", par.backend="none")
plotAUC(out)
# top features consistently recurring in the models during LOOCV
print(out$top)
# the AUC
print(out$auc)
# probabilities of classifying the sample right, if the feature selection
# and model training were performed on other samples
plot(sort(out$prob))
abline(h=0.5, col='red')

vladpetyuk/vp.misc documentation built on June 25, 2021, 6:35 a.m.