Home

/

CRAN

/

groupRemMap.cv: Fit GroupRemMap models for a series of tuning parameters and...

groupRemMap.cv: Fit GroupRemMap models for a series of tuning parameters and...
In groupRemMap: Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty

Description Usage Arguments Details Value Author(s) References Examples

Fit GroupRemMap models for a series of tuning parameters and return the corresponding v-fold cross-validation scores. Two types of cross-validation scores are computed: cv based on unshrinked estimator (ols.cv); and cv based on shrinked estimator (rss.cv); ols.cv is recommended. rss.cv tends to select very large models and thus is not recommended in general (especially for very sparse models).

1	group.remmap.cv(X,Y,G,lam1.v, lam2.v,gamma=0.5, C.m=NULL,fold=10, seed=1)

`X`	numeric matrix (n by p): columns correspond to predictor variables and rows correspond to samples. Missing values are not allowed.
`G`	numeric (typically integer) vector of length p: the memberships of the predictors.
`Y`	numeric matrix (n by q): columns correspond to response variables and rows correspond to samples. Missing values are not allowed.
`lam1.v`	numeric value: the l_1 norm penalty parameter.
`lam2.v`	numeric value: the bridge penalty parameter.
`gamma`	numeric value: the bridge degree
`C.m`	numeric matrix (p by q): C_m[i,j]=0 means the corresponding coefficient beta[i,j] is set to be zero in the model;C_m[i,j]=1 means the corresponding beta[i,j] is included in the MAP penalty; C_m[i,j]=2 means the corresponding beta[i,j] is not included in the MAP penalty; the default(=NULL) sets all C_m[i,j] to be 1.
`fold`	numeric value: the number of folds in cross validation.
`seed`	numeric value: set the seed for creating the subsamples for cross valiation.

group.remmap.cv is used to perform two-dimensional grid search of the tuning parameters (lamL1.v, lamL2.v) based on v-fold cross-validation scores. (Wang et.al., 2013).

A list with four components

`ols.cv`	a numeric matrix recording the cross validation scores based on unshrinked OLS estimators for each pair of (lamL1, lamL2).
`rss.cv`	a numeric matrix recording the cross validation scores based on shrinked GroupRemMap estimators for each pair of (lamL1, lamL2).
`phi.cv`	a list recording the fitted GroupRemMap coefficients on cross validation training subsets. Each component corresponds to one cv fold and it is again a list with each component corresponding to the estimated GroupRemMap coefficients for one pair of (lamL1, lamL2) on that training subset.
`l.index`	numeric matrix with two rows: each column is a pair of (lamL1, lamL2) and the kth column corresponds to the kth cv score in as.vector(ols.cv) and as.vector(rss.cv).

X Wang, L Qin, H Zhang, Y Zhang, L Hsu, P Wang

X Wang, L Qin, H Zhang, Y Zhang, L Hsu, P Wang (2013) "A regularized multivariate regression approach for eQTL analysis" Statistics in Biosciences

############################################
############# Generate an example data set
############################################
n=20 
#number of predictors
p=9 
#number of response variables
q=4
set.seed(1)

###assume predictors falling into 3 groups, and the sizes of group 1
#group 2, and group 3 to be 3, 2, 4, respectively
G=c(1,1,1,2,2,3,3,3,3) 

### generate X matrix
rho=0.5; Sig<-matrix(0,p,p)
for(i in 2:p){ for(j in 1: (i-1)){
    Sig[i,j]<-rho^(abs(i-j))
    Sig[j,i]<-Sig[i,j]
}}
diag(Sig)<-1
R<-chol(Sig)
X.m<-matrix(rnorm(n*p),n,p)
X.m<-X.m


######################################################
#############Create coefficient matrix, B#############
######################################################
B=matrix(0, p, q)

set.seed(100)        
#number of predictors in each group
sz.group=c(3,2,4)
cumsum.group.sz=cumsum(sz.group)
#number of significant predictors in each group
num.sig.in.group=c(2,1,3) 

###--Generate coefficients for group 1, allowing 
###--every predictor in this group to predict the 
###--the same subset of responses
#number of significant preditors, 2
num.sig.x=num.sig.in.group[1]   
#numbers of responses predicted for each significant predictor
num.reg.y=c(3,1)
#indices of the significant predictors in group 1
ids.sig.x.tmp=sample(sz.group[1], num.sig.x)
#indices of all the responses that are predicted by predictors in group 1
ids.reg.y.tmp=sample(q, max(num.reg.y))
#indices of the responses that are predicted by each predictor in group 1
idxs.reg.y.tmp=sapply(num.reg.y, function (x) sample(ids.reg.y.tmp, x))

#Generate coefficient for each row of the coefficient matrix
for(idx.tmp in 1:num.sig.x)
{
    r=ids.sig.x.tmp[idx.tmp]
    c=idxs.reg.y.tmp[[idx.tmp]]
    B[r,c]=runif(num.reg.y[idx.tmp], min=1,max=4) #
}
#        

###--Generate coefficients for group 2

num.sig.x=num.sig.in.group[2]   
num.reg.y=2
ids.sig.x.tmp=sample(sz.group[2], num.sig.x)
ids.reg.y.tmp=sample(q, max(num.reg.y))
idxs.reg.y.tmp=list(as.numeric(sapply(num.reg.y, function (x) sample(ids.reg.y.tmp, x))))

#Generate coefficient for each row of the coefficient matrix
for(idx.tmp in 1:num.sig.x)
{
    r=cumsum.group.sz[1]+ids.sig.x.tmp[idx.tmp]
    c=idxs.reg.y.tmp[[idx.tmp]]

    B[r,c]=runif(num.reg.y[idx.tmp], min=1,max=4) #
}
#        



###--Generate coefficients for group 3,

num.sig.x=num.sig.in.group[3]   
num.reg.y=c(1,2,2)
ids.sig.x.tmp=sample(sz.group[3], num.sig.x)
ids.reg.y.tmp=sample(q, max(num.reg.y))
idxs.reg.y.tmp=sapply(num.reg.y, function (x) sample(ids.reg.y.tmp, x))

#Generate coefficient for each row of the coefficient matrix
for(idx.tmp in 1:num.sig.x)
{
    r=cumsum.group.sz[2]+ids.sig.x.tmp[idx.tmp]
    c=idxs.reg.y.tmp[[idx.tmp]]

    B[r, c]=runif(num.reg.y[idx.tmp], min=1,max=4) #
}
#        
 


##############################################
### generate responses
##############################################
E.m<-matrix(rnorm(n*q),n,q)
Y.m<-X.m


###############################################
############ perform analysis
##############################################

###############################################
## 1. ## fit model for one pair of (lamL1, lamL2)
###############################################

try1=group.remmap(X.m, Y.m, G=G, lam1=100, lam2=50, gamma=0.5, phi0=NULL, C.m=NULL)


###############################################
## 2. ## Select tuning parameters with v-fold cross-validation;
##    cv based on unshrinked estimator (ols.cv) is recommended over cv
##    based on shrinked estimator (rss.cv); 
###   ## the latter tends to select large models.
################################################

lamL1.v=exp(seq(log(8), log(15), length=5))
lamL2.v=seq(10, 20, length=5)
try2=group.remmap.cv(X=X.m, Y=Y.m, G=G, lamL1.v, lamL2.v, C.m=NULL, fold=10, seed=1)
 
############# use CV based on unshrinked estimator (ols.cv)
pick=which.min(as.vector(try2$ols.cv))
lamL1.pick=try2$l.index[1,pick]    ##find the optimal (LamL1,LamL2) based on the cv score
lamL2.pick=try2$l.index[2,pick]
##fit the GroupRemMap model under the optimal (LamL1,LamL2).
result=group.remmap(X.m, Y.m, G=G, lam1=lamL1.pick, lam2=lamL2.pick, phi0=NULL, C.m=NULL)  

## number of false positives at the individual predictor level
FP=sum(B[result$phi!=0]==0) 
## number of false negatives at the individual predictor level
FN=sum(B[result$phi==0]!=0)
##CV (unshrinked) selected tuning parameters
#print(paste("lamL1=", round(lamL1.pick,3), "; lamL2=", round(lamL2.pick,3), sep="")) 
print(paste("False Postive=", FP, "; False Negative=", FN, sep=""))

## number of errors at the group level
group.selection.matrix=aggregate(result$phi, list(G), function (x) any(x>=1e-6)+0)
true.group.selection.in.B=aggregate(B, list(G), function (x) any(x>=1e-3)+0)
## number of false positives at the group level
FP.group=sum(true.group.selection.in.B[,-1][group.selection.matrix[,-1]==1]==0)
## number of false negatives at the group level
FN.group=sum(true.group.selection.in.B[,-1][group.selection.matrix[,-1]==0]!=0)
print(paste("Group level FP=", FP.group, "; group level FN=", FN.group, sep=""))



############# use CV based on shrinked estimator (rss.cv); it tends to
#############select very large models: thus is not recommended in general
pick=which.min(as.vector(try2$rss.cv))
lamL1.pick=try2$l.index[1,pick]    ##find the optimal (LamL1,LamL2) based on the cv score
lamL2.pick=try2$l.index[2,pick]
##fit the GroupRemMap model under the optimal (LamL1,LamL2).
result=group.remmap(X.m, Y.m, G=G, lam1=lamL1.pick, lam2=lamL2.pick, phi0=NULL, C.m=NULL)
## number of false positives
FP=sum(B[result$phi!=0]==0) 
FN=sum(B[result$phi==0]!=0)  ## number of false negatives
##CV (shrinked) selected tuning parameters
#print(paste("lamL1=", round(lamL1.pick,3), "; lamL2=", round(lamL2.pick,3), sep="")) 
print(paste("False Positive=", FP, "; False Negative=", FN, sep=""))



## number of false positive at the group level
FP.group=sum(true.group.selection.in.B[,-1][group.selection.matrix[,-1]==1]==0)
## number of false positive at the group level
FN.group=sum(true.group.selection.in.B[,-1][group.selection.matrix[,-1]==0]!=0)
print(paste("Group level FP=", FP.group, "; group level FN=", FN.group, sep=""))

groupRemMap documentation built on May 2, 2019, 7:31 a.m.

groupRemMap index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

groupRemMap
Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty

groupRemMap.cv: Fit GroupRemMap models for a series of tuning parameters and...
In groupRemMap: Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Related to groupRemMap.cv in groupRemMap...

R Package Documentation

Browse R Packages

We want your feedback!

groupRemMap Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty

groupRemMap.cv: Fit GroupRemMap models for a series of tuning parameters and... In groupRemMap: Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Related to groupRemMap.cv in groupRemMap...

R Package Documentation

Browse R Packages

We want your feedback!

groupRemMap
Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty

groupRemMap.cv: Fit GroupRemMap models for a series of tuning parameters and...
In groupRemMap: Regularized Multivariate Regression for Identifying Master Predictors Using the GroupRemMap Penalty