findReasonableLambdaHelper: Function to run on a dataset with not too much missing data...


View source: R/findReasonableLambdaHelper.R

Description

Expects a singly imputed dataset and fits a logistic LASSO, so that the user can pick a set of lambda values that will probably be interesting.
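The cross-validated fit behind this description can be approximated directly with glmnet. The following is a minimal sketch on simulated data, not the package's own implementation: the data, the object names, and the mapping of cv.glmnet's cvlo, cvm, cvup and cvsd fields onto the critl, crit, critu and critsd columns are assumptions made for illustration.

```r
library(glmnet)

set.seed(1)
n <- 200
p <- 10
# Simulated, fully observed predictors (stand-in for a singly imputed dataset)
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("v", 1:p)))
# Binary outcome that depends on the first two predictors only
y <- factor(ifelse(runif(n) < plogis(x[, 1] - x[, 2]), "yes", "no"))

# Cross-validated logistic LASSO, scored by AUC, using this page's defaults
cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "auc",
                   nfolds = 10, standardize = FALSE)

# One row per lambda: criterion estimate with its lower and upper bounds
crit <- data.frame(lambda = cvfit$lambda,
                   critl = cvfit$cvlo,
                   crit = cvfit$cvm,
                   critu = cvfit$cvup,
                   critsd = cvfit$cvsd)
head(crit)

# Lambda with the best cross-validated criterion
cvfit$lambda.min
```

The lambda sequence in cvfit$lambda runs from high to low, which is why neighbouring values above and below the optimum can be picked by index.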

Usage

findReasonableLambdaHelper(ds, out, family = "binomial",
  showFirst = 20, showPlot = TRUE, type.measure = "auc",
  repsNeededForFirstOccurrence = 3,
  weights = rep(1, nrow(ds)), ..., verbosity = 0,
  minNumHigher = 20, minNumLower = 20, maxNumLower = 30,
  imputeDs2FitDsProperties = normalImputationConversion(),
  standardize = FALSE, nfolds = 10)

## S3 method for class 'LambdaHelper'
object[i, j, drop = TRUE]

getLambdas(x, ...)

## S3 method for class 'lambdaregion'
getLambdas(x, ...)

## S3 method for class 'LambdaHelper'
getLambdas(x, ...)

Arguments

ds

dataset to investigate

out

outcome vector

family

see glmnet. Defaults to "binomial" (i.e. lasso penalized logistic regression).

showFirst

number of top coefficients to show (the first showFirst to occur along the lambda path)

showPlot

if TRUE (the default), a plot is shown to support the decision

type.measure

see cv.glmnet

repsNeededForFirstOccurrence

How many times (i.e. for how many lambda values) must a coefficient be consecutively nonzero before we count it as "occurring"

weights

vector with weight to be assigned to each row of ds

...

passed on to plotex (if relevant)

verbosity

The higher this value, the more levels of progress and debug information are displayed (note: in R for Windows, turn off buffered output)

minNumHigher

How many lambdas higher than the optimum do you minimally want (if available)

minNumLower

How many lambdas lower than the optimum do you minimally want (if available)

maxNumLower

How many lambdas lower than the optimum do you maximally want

imputeDs2FitDsProperties

see imputeDs2FitDs and EMLasso

standardize

see glmnet. Defaults to FALSE.

nfolds

see cv.glmnet. Defaults to 10.

object

LambdaHelper

i

row index

j

column index. If this is missing, the ith lambda is returned

drop

if TRUE the result is coerced to the simplest structure possible

x

object to find 'interesting' set of lambdas for
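The repsNeededForFirstOccurrence rule described above can be illustrated in base R. This is a sketch under stated assumptions, not the package's code: firstOccurrence is a hypothetical helper, beta is a made-up coefficient matrix with one row per variable and one column per lambda (ordered high to low), and "first occurrence" is taken to be the start of the first run of at least reps consecutive nonzero coefficients.

```r
# Hypothetical helper, not part of EMLasso.
# beta: numeric matrix, rows = variables, columns = lambda indices (high to low).
# Returns, per variable, the lambda index where its first run of at least
# `reps` consecutive nonzero coefficients starts, or NA if there is no such run.
firstOccurrence <- function(beta, reps = 3) {
  apply(beta != 0, 1, function(nz) {
    r <- rle(nz)                      # runs of zero / nonzero
    ends <- cumsum(r$lengths)
    starts <- ends - r$lengths + 1
    ok <- r$values & r$lengths >= reps
    if (any(ok)) starts[which(ok)[1]] else NA_integer_
  })
}

# Toy coefficient path for three variables over six lambda values
beta <- rbind(
  a = c(0, 0, 1, 1, 1, 1),  # nonzero from index 3 onwards
  b = c(0, 1, 0, 1, 1, 1),  # brief appearance at 2, stable from 4
  c = c(0, 0, 0, 0, 1, 1)   # only two consecutive nonzero values
)

firstOccurrence(beta, reps = 3)  # a = 3, b = 4, c = NA
firstOccurrence(beta, reps = 1)  # a = 3, b = 2, c = 5
```

Requiring several consecutive nonzero values keeps a variable that flickers into the model at a single lambda (like b at index 2) from being reported as having entered early.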

Value

list of class "LambdaHelper":

topres

data.frame with showFirst rows, and columns: variable (name), lambda, critl (lower bound of criterion), crit (estimate of criterion), critu (upper bound of criterion), critsd (sd of criterion), index (at which lambda index does this variable first occur)

allLambda

vector of lambda values

regionDfr

data.frame with 3 rows and 3 columns: name (values: "lower lambda", "optimum", and "higher lambda"), idx and lambda

regionOfInterestData

see getMinMaxPosLikeGlmnet

For the [ method: the result depends on the parameters

For getLambdas: vector of lambda values, normally high to low
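To make the shape of regionDfr concrete, the sketch below builds a three-row data.frame of that form from a lambda sequence and the index of the optimum. lambdaRegion is a hypothetical helper, not an EMLasso function, and the windowing rule (step minNumHigher indices towards higher lambda and minNumLower towards lower lambda, clipped to what is available) is an assumption. maxNumLower is left out because its exact interplay with minNumLower is not described on this page.

```r
# Hypothetical helper, not part of EMLasso.
# lambda: lambda values sorted from high to low (as glmnet produces them)
# optIdx: index of the optimal lambda within `lambda`
lambdaRegion <- function(lambda, optIdx, minNumHigher = 20, minNumLower = 20) {
  # Higher lambdas sit at smaller indices, lower lambdas at larger indices
  hiIdx <- max(1, optIdx - minNumHigher)
  loIdx <- min(length(lambda), optIdx + minNumLower)
  idx <- c(loIdx, optIdx, hiIdx)
  data.frame(name = c("lower lambda", "optimum", "higher lambda"),
             idx = idx,
             lambda = lambda[idx])
}

# 100 log-spaced lambda values from 1 down to 0.001
lambda <- exp(seq(log(1), log(0.001), length.out = 100))

region <- lambdaRegion(lambda, optIdx = 30)
region

# Near the start of the path fewer higher lambdas are available,
# so the window is clipped at index 1
lambdaRegion(lambda, optIdx = 5)
```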

Note

EMLasso is computationally heavy and has to be run per lambda. This function helps preselect some lambda values, and can typically avoid useless calculations for non-interesting lambda values.

Author(s)

Nick Sabbe nick.sabbe@ugent.be

Examples

library(EMLasso)

# Simulate 100 observations with 10 categorical and 10 continuous predictors
aDfr <- generateTypicalIndependentDfr(numCat=10, numCnt=10, numObs=100, catProbs=rep(1/3,3),
  rcnt=typicalRandomNorm, doShuffle=TRUE, verbosity=1)

# Binary outcome driven by cnt1 and by level "b" of cat1
outlins <- -mean(aDfr$cnt1) + aDfr$cnt1 + 2*(aDfr$cat1=="b")
outprobs <- expit(outlins)
y <- factor(sapply(outprobs, function(prob){sample(c("no", "yes"), 1, prob=c(1-prob,prob))}))

rlh <- findReasonableLambdaHelper(aDfr, y, verbosity=10)

# The remaining examples first load the emlcvfit example data shipped with EMLasso
data(emlcvfit, package="EMLasso")
rlh <- findReasonableLambdaHelper(aDfr, y, verbosity=10)

# Indexing a LambdaHelper: with j missing, the ith lambda is returned
rlh[1]
rlh[1:5, NULL]

# Extract the lambdas of interest, from the region data or from the helper itself
getLambdas(rlh$regionOfInterestData)
getLambdas(rlh)

EMLasso documentation built on May 2, 2019, 5:49 p.m.