coef.ecpc: Obtain coefficients from 'ecpc' object
In ecpc: Flexible Co-Data Learning for High-Dimensional Prediction

Description

Obtain regression coefficients or penalties from an existing model fit given in an 'ecpc' object, re-estimate regression coefficients for a given 'ecpc' object and ridge penalties, or obtain ridge penalties for given prior parameters and co-data.

Usage

## S3 method for class 'ecpc'
coef(object, penalties = NULL, 
          X = NULL, Y = NULL, 
          unpen = NULL, intrcpt = TRUE, 
          model = c("linear", "logistic", "cox"), 
          est_beta_method = c("glmnet", "multiridge"), ...)

penalties(object, tauglobal=NULL, sigmahat=NULL, gamma=NULL, gamma0=NULL, w=NULL,
          Z=NULL, groupsets=NULL,
          unpen=NULL, datablocks=NULL)

Arguments

`object`	An 'ecpc' object returned by `ecpc`.
`penalties`	Ridge penalties; p-dimensional vector. If provided to `coef.ecpc`, 'X' and 'Y' should be provided too.
`tauglobal`	Estimated global prior variance; scalar (or vector with datatype-specific global prior variances when multiple ‘datablocks’ are given). If provided to `penalties`, 'Z' or 'groupsets' should be provided too.
`sigmahat`	(linear model) Estimated sigma^2. If provided to `penalties`, 'Z' or 'groupsets' should be provided too.
`gamma`	Estimated co-data variable weights; vector with length equal to the total number of groups. If provided to `penalties`, 'Z' or 'groupsets' should be provided too.
`gamma0`	Estimated co-data variable intercept; scalar. If provided to `penalties`, 'Z' or 'groupsets' should be provided too.
`w`	Estimated group set weights; m-dimensional vector. If provided to `penalties`, 'Z' or 'groupsets' should be provided too.
`X`	Observed data; (nxp)-dimensional matrix (p: number of covariates) with each row the observed high-dimensional feature vector of a sample.
`Y`	Response data; n-dimensional vector (n: number of samples) for linear and logistic outcomes, or `Surv` object for Cox survival.
`Z`	List with m co-data matrices. Each element is a (pxG)-dimensional co-data matrix containing co-data on the p variables. Co-data should either be provided in ‘Z’ or ‘groupsets’.
`groupsets`	Co-data group sets; list with m (m: number of group sets) group sets. Each group set is a list of all groups in that set. Each group is a vector containing the indices of the covariates in that group.
`unpen`	Unpenalised covariates; vector with indices of covariates that should not be penalised.
`intrcpt`	Should an intercept be included? Included by default for linear and logistic models; excluded for Cox models, for which the baseline hazard is estimated instead.
`model`	Type of model for the response; linear, logistic or cox.
`est_beta_method`	Package used for estimating regression coefficients, either "glmnet" or "multiridge".
`datablocks`	(optional) For multiple data types, the corresponding blocks of data may be given in datablocks; a list of B vectors of the indices of covariates in ‘X’ that belong to each of the B data blocks. Unpenalised covariates should not be given as a separate block, but can be omitted or included in blocks with penalised covariates. Each datatype obtains a datatype-specific ‘tauglobal’ as in multiridge.
`...`	Other parameters
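
The co-data formats described for 'groupsets', 'Z' and 'datablocks' can be sketched in base R as follows. This is a minimal illustration with made-up values (p = 6 covariates, two group sets). The helper `membership_matrix` is hypothetical and builds a plain 0/1 membership matrix; it is not claimed to reproduce the scaling used by the package's own helpers such as `createZforGroupset`.

```r
p <- 6  # number of covariates (hypothetical toy value)

# 'groupsets': list of m group sets; each group set is a list of groups;
# each group is a vector with the indices of the covariates in that group
groupsets <- list(
  setA = list(c(1, 2, 3), c(4, 5, 6)),    # group set with G = 2 groups
  setB = list(c(1, 4), c(2, 5), c(3, 6))  # group set with G = 3 groups
)

# hypothetical helper: 0/1 membership matrix (p x G) for one group set
membership_matrix <- function(groupset, p) {
  sapply(groupset, function(idx) as.numeric(seq_len(p) %in% idx))
}

# 'Z': list with m co-data matrices, each of dimension p x G
Z <- lapply(groupsets, membership_matrix, p = p)
sapply(Z, dim)  # one column of (p, G) per co-data matrix

# 'datablocks': list of B index vectors into the columns of X
datablocks <- list(1:3, 4:6)
```

Either representation carries the same grouping information here; which one to pass depends on how the model in 'object' was fitted.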

Value

For coef.ecpc, a list with:

`intercept`	If included, the estimated intercept; scalar.
`beta`	Estimated regression coefficients; p-dimensional vector.

For penalties: a p-dimensional vector with ridge penalties.
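
As a rough sketch of how the returned list might be used, the snippet below computes a linear predictor from an intercept and a coefficient vector. The object `coefs` is a hand-made stand-in with the same element names as the value described above (it is not produced by the package), and `X_new` holds made-up data.

```r
# Hypothetical stand-in for the list returned by coef() on an 'ecpc' fit
coefs <- list(intercept = 0.5, beta = c(1, -2, 0))

# New samples: 2 x p matrix with p = 3 covariates (made-up values)
X_new <- matrix(c(1, 0, 2,
                  0, 1, 1), nrow = 2, byrow = TRUE)

# Linear predictor: intercept + X %*% beta
lp <- as.vector(coefs$intercept + X_new %*% coefs$beta)
lp
```

For a logistic model the linear predictor would typically be mapped to a probability with the inverse logit, for example `plogis(lp)`.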

Examples

 
#####################
# Simulate toy data #
#####################
library(ecpc) #load the package that provides simDat, ecpc and the helpers used below
p<-300 #number of covariates
n<-100 #sample size training data set
n2<-100 #sample size test data set

#simulate all betas i.i.d. from beta_k~N(mean=0,sd=sqrt(0.1)):
muBeta<-0 #prior mean
varBeta<-0.1 #prior variance
indT1<-rep(1,p) #vector with group numbers all 1 (all simulated from same normal distribution)

#simulate test and training data sets:
Dat<-simDat(n,p,n2,muBeta,varBeta,indT1,sigma=1,model='linear') 
str(Dat) #Dat contains centered observed data, response data and regression coefficients

###################
# Provide co-data #
###################
#setting 1: informative continuous co-data (absolute effect sizes) and its square root
continuousCodata <- abs(Dat$beta)
Z1 <- cbind(continuousCodata,sqrt(continuousCodata))

#setting 2: splines for informative continuous co-data
Z2 <- createZforSplines(values=continuousCodata)
S1.Z2 <- createS(orderPen=2, G=dim(Z2)[2]) #create difference penalty matrix
Con2 <- createCon(G=dim(Z2)[2], shape="positive+monotone.i") #create constraints

#setting 3: 5 random groups
G <- 5
categoricalRandom <- as.factor(sample(1:G,p,TRUE))
#make group set, i.e. list with G groups:
groupsetRandom <- createGroupset(categoricalRandom)
Z3 <- createZforGroupset(groupsetRandom,p=p)
S1.Z3 <- createS(G=G, categorical = TRUE) #create difference penalty matrix
Con3 <- createCon(G=dim(Z3)[2], shape="positive") #create constraints

#fit ecpc for the three co-data matrices with following penalty matrices and constraints
#note: can also be fitted without paraPen and/or paraCon
Z.all <- list(Z1=Z1,Z2=Z2,Z3=Z3)
paraPen.all <- list(Z2=list(S1=S1.Z2), Z3=list(S1=S1.Z3))
paraCon <- list(Z2=Con2, Z3=Con3)

############
# Fit ecpc #
############
tic<-proc.time()[[3]]
fit <- ecpc(Y=Dat$Y,X=Dat$Xctd,
           Z = Z.all, paraPen = paraPen.all, paraCon = paraCon,
           model="linear",maxsel=c(5,10,15,20),
           Y2=Dat$Y2,X2=Dat$X2ctd)
toc <- proc.time()[[3]]-tic

#estimate coefficients for twice as large penalties
new_coefficients <- coef(fit, penalties=fit$penalties*2, X=Dat$Xctd, Y=Dat$Y)

#change some prior parameters and find penalties
gamma2 <- fit$gamma; gamma2[1:3] <- 1:3
new_penalties <- penalties(fit, gamma=gamma2, Z=Z.all)
new_coefficients2 <- coef(fit, penalties=new_penalties, X=Dat$Xctd, Y=Dat$Y)
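
To give some intuition for what re-estimating coefficients under different ridge penalties does, the following base R sketch uses the textbook closed-form ridge estimator for a linear model with one penalty per covariate. It is an illustration under stated assumptions only: the helper `ridge_coef` is hypothetical, the data are simulated here, and the scaling of penalties (for example relative to the sample size or sigma^2) in "glmnet" or "multiridge" may differ from this formula.

```r
set.seed(1)
n <- 50; p <- 10
X <- scale(matrix(rnorm(n * p), n, p), scale = FALSE)  # centered covariates
beta_true <- rnorm(p, sd = sqrt(0.1))
Y <- as.vector(X %*% beta_true + rnorm(n))
Y <- Y - mean(Y)  # centered response, so no intercept is needed

# hypothetical helper: solves (X'X + diag(penalties)) beta = X'Y
ridge_coef <- function(X, Y, penalties) {
  solve(crossprod(X) + diag(penalties, nrow = ncol(X)), crossprod(X, Y))[, 1]
}

pen <- rep(5, p)                     # one ridge penalty per covariate
beta_1 <- ridge_coef(X, Y, pen)      # estimate for the original penalties
beta_2 <- ridge_coef(X, Y, 2 * pen)  # estimate for twice as large penalties

# larger penalties shrink the estimate towards zero
c(sum(beta_1^2), sum(beta_2^2))
```

Doubling all penalties, as in the `coef` call above, is expected to pull the coefficients closer to zero in the same way.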

ecpc documentation built on March 7, 2023, 6:46 p.m.