pblm.penalty: Auxiliary for specifying penalty terms in a 'pblm' model

View source: R/pblm_0.1-12.R

pblm.penaltyR Documentation

Auxiliary for specifying penalty terms in a pblm model

Description

This is an auxiliary function for specifying penalty terms in a pblm model.

Usage

pblm.penalty(pnl.type=c("none","ARC1","ARC2","ridge","lasso","lassoV", "equal"),
             lam1=NULL, lam2=NULL, lam3=NULL, lam4=NULL, 
             s1=NULL, s2=NULL, s3=NULL, s4=NULL, min.lam.fix=0.1, 
             constraints=FALSE, lamv1=1e10, lamv2=1e10)

Arguments

pnl.type

The type of penalty term to be used. By default "none" of them is used. See Details.

lam1, lam2

vectors of smoothing parameters for the marginals. By default they are zero. It is common to all the penalty terms implemented.

lam3

vector of smoothing parameters for the association. It is common to all the penalty terms implemented. If "ARC2" is selected, it will act on differences of by-row adjacent parameters.

lam4

vector smoothing parameters for the association. Specific for the "ARC2" penalty, for which the differences of by-column adjacent parameters are penalized

s1, s2, s3, s4

the orders of the difference operator. Specific for the "ARC2" penalty.

min.lam.fix

the minimum value for any penalty parameters in order to consider any lamdas smaller than this value as fixed. See details.

constraints

Should inequality constraints be applied to the marginal probabilities? This is done through a penalty term and intended for ordered resposes.

lamv1, lamv2

penalty parameters to be applied to both the marginal probabilities in order to mimicking inequality constraints.

Details

Some penalty terms implemented in pblm are described in Enea and Lovison (2014) and Enea and Attanasio (2015).

Just one penalty per model is allowed.

Penalty "ARC1" acts on first order differences of category-adjacent parameters. By increasing the smoothing parameters, the resulting marginal and/or association parameters, will tend to be equal among the categories. When the underlying contingency table cross-classifying the responses contains zero cells, This penalty may be useful to stabilize the estimates, for example to get a more "regular" association structure.

Penalty "ARC2" generalizes in a certain sense "ARC1", since it acts on high order differences of Ajacent Row and/or Column parameters, but it is maily used for ordered responses. For high smoothing values it constraints the marginal parameters to lie onto a polynomial curve, and/or the association structure to lie onto a polynomial surface. The degrees of the marginal polynomials are determined by s1-1, for the first marginal, and s2-1 for the second. The degree of the association polynomial surface is determined by s3+s4-2, in which s3-1 is the by-row polynomial degree and s4-1 the by-column one.

Penalty "ridge" constraints the regression parameters towards zero (horizontal penalty), so providing equal estimates for high penalty values.

The current implementation of the "lasso" and "lassoV" penalty terms is to be considerer provisional and needs to be better checked.

Penalty "lasso" is acts similarly to "ridge" and it is based on absolute values of the regression parameters.

Penalty "lassoV" vertically penalizes the absolute value of differences of adjacent row and column parameters. It is similar to ARC1. By increasing the smoothing parameter, the resulting marginal and/or association regression parameters, will tend to be equal among the response categories. This

Penalty "equal" constraints the marginal equations to be equal, so providing equal estimates. This could be useful, for example, in eyes or twins studies. The tuning parameter to be specified for this penalty is lam1.

Furthermore, if global logits are specified, inequality constraints on marginal predictors are mimicked by using penalty terms. Argument lamv1 is the penalty parameter for the marginal predictor of the first response, lamv2 is that for the second one.

Argument min.lam.fix is useful when an automatic selection of penalty parameters is desidered for certain lambdas only. The remaining lambdas can be either 0 or less than min.lam.fix, and excluded from the automatic selection. Fixing some lambdas to assume values in [0,min.lam.fix) may be useful, for example, for parameter space regularization.

Value

A list with the same arguments of the function, unless unlikely specified by the user.

Author(s)

Marco Enea marco.enea@unipa.it

References

Enea, M. and Attanasio, M. 2015. A model for bivariate data with application to the analysis of university students success. Journal of Applied Statistics, http://dx.doi.org/10.1080/02664763.2014.998407

Enea, M. and Lovison, G. 2019. A penalized approach for the bivariate logistic regression model with applications to social and medical data. Statistical Modelling, 19(5), 467-500.

See Also

pblm

Examples


#Example 1
# A British male sample on occupational status. 
data(bms)

# A third degree polynomial surface with equally-spaced integer scores
m1 <- pblm(fo1=cbind(fathers,sons)~1, data=bms, weights=bms$freq,
             penalty=pblm.penalty(pnl.type="ARC2",lam3=c(1e7), lam4=c(1e7), 
                                  s3=c(4), s4=c(4)))

require(lattice)
g <- expand.grid("sons"=1:6,"fathers"=1:6)
g$logGOR <- m1$coef[13:48]

oldpar <- par(no.readonly = TRUE) 
wireframe(logGOR ~ sons*fathers, data = g, zlim=c(min(g$logGOR-1),max(g$logGOR+1)), 
          scales = list(arrows = FALSE), screen = list(z = -130, x = -70),
          col.regions="magenta")
par(oldpar)


#Example 2

# an artificial data frame with two binary responses and two factors
set.seed(12)
da <- expand.grid("Y1"=0:1,"Y2"=0:1,"fat1"=0:1,"fat2"=0:1)
da$Freq <- sample(0:20,2*2*2*2,replace=TRUE)

# A quasi-independence model obtained by strongly penalizing the association intercept 
# through a ridge-type penalty term
m3 <- pblm(fo1=cbind(Y1,Y2) ~ fat1 + fat2, 
           fo12=~ 1, 
           data=da, weights=da$Freq, type="ss",
           proportional=pblm.prop(prop12=c(TRUE)),
           penalty=pblm.penalty(pnl.type="ridge",lam3=1e12))
summary(m3)

# notice that the last coefficient is not exactly zero
coef(m3)

m3.1 <- glm(Y1 ~ fat1 + fat2, data=da, weights=Freq, family=binomial)
m3.2 <- glm(Y2 ~ fat1 + fat2, data=da, weights=Freq, family=binomial)

all.equal(logLik(m3), logLik(m3.1)[1]+logLik(m3.2)[1])




pblm documentation built on June 19, 2025, 5:08 p.m.