fitSEMMS: Run the Generalized Alternating Maximization algorithm to fit...


Description

Coefficients of predictors in a GLM are modeled as a three-component normal mixture, where the majority of predictors are assumed to belong to the null component (i.e., they have no effect on the mean of the response) and the others can have a positive or a negative effect. fitSEMMS runs a Generalized Alternating Maximization algorithm to fit the model.
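
The mixture described above can be sketched in symbols. This is only one way to write such a model: the notation (the indicator \gamma_k, the mixing probabilities p_-, p_0, p_+, and the random effect u_k) is an assumption chosen to line up with the names mu, s2r, and s2e listed under Value, and the package's exact parameterization may differ.

```latex
% Linear predictor: fixed effects x_i plus putative predictors z_{ik}
\eta_i = x_i^{\top}\beta + \sum_{k=1}^{K} z_{ik}\,\gamma_k\,u_k ,
\qquad u_k \sim N(\mu,\ \sigma_r^2)

% Component membership of predictor k: negative, null, or positive
\gamma_k \in \{-1,\ 0,\ +1\}, \qquad
P(\gamma_k = -1) = p_- ,\quad
P(\gamma_k = 0) = p_0 ,\quad
P(\gamma_k = +1) = p_+ ,\qquad
p_- + p_0 + p_+ = 1

% Normal response case, with residual variance \sigma_e^2
y_i = \eta_i + \varepsilon_i , \qquad \varepsilon_i \sim N(0,\ \sigma_e^2)
```

Under this reading, a predictor with \gamma_k = 0 drops out of the linear predictor, which is what "no effect on the mean of the response" means, and p_0 is expected to be close to 1.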

Usage

fitSEMMS(
  dat,
  mincor = 0.7,
  nn = 5,
  nnset = NULL,
  distribution,
  rnd = F,
  BHthr = 0.01,
  initWithEdgeFinder = T,
  minchange = 1,
  maxst = 20,
  ptf = F,
  verbose = FALSE
)

Arguments

dat

The dataset, as generated by readInputFile().

mincor

The minimum correlation coefficient between a pair of putative variables above which the pair is considered highly correlated. Default is 0.7.
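
As an illustration of what a correlation threshold like mincor does, the following base R sketch flags pairs of columns whose correlation exceeds the threshold. It is not the package's internal code: the simulated matrix Z, the use of the absolute Pearson correlation, and the variable names are all assumptions made for the example.

```r
set.seed(42)
n <- 200

# Simulated putative predictors: z2 is built to track z1 closely
z1 <- rnorm(n)
Z <- cbind(z1 = z1,
           z2 = z1 + rnorm(n, sd = 0.3),
           z3 = rnorm(n),
           z4 = rnorm(n))

mincor <- 0.7  # same value as the fitSEMMS default

# Pairwise correlations; keep each pair once (upper triangle only)
R <- cor(Z)
high_pairs <- which(abs(R) > mincor & upper.tri(R), arr.ind = TRUE)

# Each row of high_pairs holds the column indices of one flagged pair
print(high_pairs)
```

Raising mincor flags fewer pairs, so fewer variables would be treated as near-duplicates of one another.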

nn

The initial value for the number of non-null variables. Default is 5.

nnset

Optional. Instead of an initial number of candidates, specify the column numbers in the Z matrix to use in the first iteration. Default is NULL.

distribution

The distribution of the response (N, P, or B).

rnd

Whether to run the randomized algorithm (TRUE) or the greedy algorithm (FALSE, the default).

BHthr

The Benjamini-Hochberg threshold to be used when determining the initial set of non-null variables. Default=0.01.
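
To make the role of a Benjamini-Hochberg threshold concrete, here is a hedged base R sketch that screens predictors one at a time and keeps those whose adjusted p-value falls below the threshold. The simulated data and the one-predictor-at-a-time lm() screen are assumptions for illustration; the way fitSEMMS computes its initial set internally may differ.

```r
set.seed(1)
n <- 100
K <- 20

# Simulated data: only columns 3 and 7 of Z affect the response
Z <- matrix(rnorm(n * K), nrow = n, ncol = K)
y <- 2 * Z[, 3] - 1.5 * Z[, 7] + rnorm(n)

# p-value of the slope from a separate simple regression for each column
pvals <- apply(Z, 2, function(z) summary(lm(y ~ z))$coefficients[2, 4])

# Benjamini-Hochberg adjustment, then threshold at BHthr
BHthr <- 0.01
adj <- p.adjust(pvals, method = "BH")
initial_set <- which(adj < BHthr)

print(initial_set)
```

A smaller BHthr gives a more conservative starting set; a larger one lets more candidates in at the first iteration.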

initWithEdgeFinder

Determines whether to use the edgefinder package to find highly correlated pairs of predictors (default=TRUE).

minchange

The minimum change in the log-likelihood that is to be considered meaningful. Default=1.

maxst

The maximum number of iterations of the algorithm. Default=20.
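
The interplay between minchange and maxst can be pictured as a standard stopping rule: iterate until the log-likelihood gain drops below minchange or the iteration cap is reached. The sketch below uses a made-up log-likelihood trajectory (the hypothetical function loglik_at) purely to show the control flow; it is not taken from the package source.

```r
minchange <- 1   # smallest log-likelihood gain considered meaningful
maxst <- 20      # cap on the number of iterations

# Hypothetical log-likelihood after each iteration: rises toward 0,
# with smaller and smaller gains
loglik_at <- function(it) -100 * exp(-it / 2)

prev <- loglik_at(0)
iterations_run <- 0
for (it in seq_len(maxst)) {
  iterations_run <- it
  cur <- loglik_at(it)
  # Stop as soon as the improvement is no longer meaningful
  if (cur - prev < minchange) break
  prev <- cur
}

print(iterations_run)
```

With these toy values the loop ends well before maxst, which is the typical situation when the gains shrink quickly.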

ptf

Whether to print output at each iteration to a file called SEMMS.log. Default=FALSE.

verbose

Whether to show progress messages to the user. Default=FALSE.

Value

inits

The initial values obtained from initVals().

initNN

The column numbers of the putative variables to be included in the model in the first iteration.

gam.out

The output from the GAMupdate() function, which is a list containing: nn, the selected variables; mu, beta, s2r, and s2e, the parameter estimates of the mixture model; gam_nn, the sign (+/-1) of the effect of the non-null variables; pp, a table of posterior probabilities; and lockedOut, any variables found to be highly correlated with selected variables.

distribution

The distribution selected by the user.

mincor

The user's input for mincor above. Ignored if initWithEdgeFinder is set to TRUE.

Examples

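## Not run: 
# Locate the AR1SIM example dataset shipped with the package
fn <- system.file("extdata", "AR1SIM.RData", package = "SEMMS", mustWork = TRUE)
# Column 1 is the response; columns 2 to 100 form the Z matrix
dataYXZ <- readInputFile(fn, ycol = 1, Zcols = 2:100)
# Fit with the greedy algorithm (rnd = FALSE), starting from 15 candidates
fittedSEMMS <- fitSEMMS(dataYXZ, mincor = 0.8, nn = 15, minchange = 1,
                        distribution = "N", verbose = TRUE, rnd = FALSE)
## End(Not run)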
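
Once a fit is available, the components listed under Value can be read off the returned list. The following is a minimal sketch, assuming the SEMMS package is installed (the guard skips everything otherwise) and repeating the example call above; it uses only element names documented in the Value section, and the variable names fit, selected, and signs are illustrative.

```r
# Run only if the SEMMS package is available
has_semms <- requireNamespace("SEMMS", quietly = TRUE)

if (has_semms) {
  fn <- system.file("extdata", "AR1SIM.RData", package = "SEMMS", mustWork = TRUE)
  dataYXZ <- SEMMS::readInputFile(fn, ycol = 1, Zcols = 2:100)
  fit <- SEMMS::fitSEMMS(dataYXZ, mincor = 0.8, nn = 15, minchange = 1,
                         distribution = "N", verbose = FALSE, rnd = FALSE)

  selected <- fit$gam.out$nn      # selected variables
  signs <- fit$gam.out$gam_nn     # sign (+/-1) of each selected effect
  print(selected)
  print(signs)
  print(fit$gam.out$lockedOut)    # variables highly correlated with selected ones
  print(fit$initNN)               # columns used in the first iteration
}
```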

haimbar/SEMMS documentation built on Dec. 20, 2021, 2:44 p.m.