metafuse: fit a GLM with fusion penalty for data integration


View source: R/metafuse_functions.R

Description

Fits a GLM with a fusion penalty on the coefficients of each covariate across datasets, and generates solution paths and fusograms to visualize the model selection.

Usage

metafuse(X = X, y = y, sid = sid, fuse.which = c(0:ncol(X)),
  family = "gaussian", intercept = TRUE, alpha = 0, criterion = "EBIC",
  verbose = TRUE, plots = FALSE, loglambda = TRUE)

Arguments

X

a matrix (or vector) of predictor(s), with dimensions N x (p-1), where N is the total sample size of the integrated dataset and p is the number of covariates including the intercept

y

a response vector of length N; when family="cox", y is a data frame with columns time and status

sid

a vector of data source IDs of length N; must contain integers from 1 to K, where K is the number of datasets

fuse.which

a vector of integers from 0 to p-1, indicating which covariates are considered for fusion; 0 corresponds to the intercept; coefficients of covariates not in this vector are homogeneously estimated across all datasets

family

the type of response: "gaussian" if y is a continuous vector, "binomial" if y is a binary vector, "poisson" if y is a count vector, "cox" if y is a data frame with columns time and status; default is "gaussian"

intercept

if TRUE, intercept will be included, default is TRUE

alpha

the ratio of sparsity penalty to fusion penalty, default is 0 (i.e., no variable selection, only fusion)

criterion

"AIC" for AIC, "BIC" for BIC, "EBIC" for extended BIC; default is "EBIC"

verbose

if TRUE, prints the current value of lambda and the number of coefficient clusters per covariate whenever a fusion event happens; default is TRUE

plots

if TRUE, creates solution path and fusogram plots to visualize the clustering of regression coefficients across datasets; default is FALSE

loglambda

if TRUE, lambda is plotted on a log10 scale; default is TRUE
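
As a rough illustration of the input layout described above, the sketch below stacks a few per-study data frames into a single X, y and sid, recoding arbitrary study labels to the integers 1 to K. The list studies, its labels and its column names are hypothetical, and only base R is used.

```r
# Hypothetical per-study data frames with arbitrary labels (not from the package)
set.seed(1)
studies <- list(
  siteA = data.frame(y = rnorm(4), x1 = rnorm(4), x2 = rnorm(4)),
  siteB = data.frame(y = rnorm(3), x1 = rnorm(3), x2 = rnorm(3)),
  siteC = data.frame(y = rnorm(5), x1 = rnorm(5), x2 = rnorm(5))
)

# Stack all studies row-wise into one integrated dataset of N rows
stacked <- do.call(rbind, studies)
labels  <- rep(names(studies), times = vapply(studies, nrow, integer(1)))

# Recode labels to consecutive integers 1..K, in order of first appearance
sid <- as.integer(factor(labels, levels = unique(labels)))

y <- stacked$y
X <- as.matrix(stacked[, c("x1", "x2")])   # N x (p-1), no intercept column

# Basic checks mirroring the argument descriptions
K <- length(unique(sid))
stopifnot(length(sid) == nrow(X), length(y) == nrow(X), setequal(sid, seq_len(K)))
```

The resulting X, y and sid have the shapes that the argument list asks for; whether metafuse accepts a matrix, a data frame, or both for X is not stated here, so check the package source if in doubt.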

Details

An adaptive lasso penalty is used; see Zou (2006) for details.
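
To make the roles of lambda and alpha more concrete, the sketch below evaluates one plausible form of such a penalty for a single covariate: adaptive-lasso-style weights (inverse absolute values of initial estimates) applied to the differences between adjacent coefficients after ordering by the initial estimates, plus alpha times a weighted sparsity term. The exact form used by metafuse may differ; the function name, the weight exponent of 1 and the ordering rule are assumptions made for illustration.

```r
# Illustrative penalty for one covariate across K datasets (not the package code)
#   beta      : candidate coefficients, one per dataset
#   beta_init : initial (unpenalized) estimates used to build the adaptive weights
# Note: tied or zero initial estimates would give infinite weights in this sketch.
fused_adaptive_penalty <- function(beta, beta_init, lambda, alpha = 0) {
  ord <- order(beta_init)            # order datasets by initial estimate
  b   <- beta[ord]
  b0  <- beta_init[ord]

  # Fusion term: weighted absolute differences of adjacent ordered coefficients
  w_fuse <- 1 / abs(diff(b0))
  fusion <- sum(w_fuse * abs(diff(b)))

  # Sparsity term: weighted absolute values, scaled by alpha
  w_sparse <- 1 / abs(b0)
  sparsity <- sum(w_sparse * abs(b))

  lambda * (fusion + alpha * sparsity)
}

beta_init <- c(0.1, -0.2, 1.1, 0.9)
fused_adaptive_penalty(beta_init, beta_init, lambda = 1)             # about 3: each of the 3 gaps costs 1
fused_adaptive_penalty(rep(0.5, 4), beta_init, lambda = 1)           # 0: fully fused, no gaps
fused_adaptive_penalty(beta_init, beta_init, lambda = 1, alpha = 1)  # about 7: 3 gaps plus 4 nonzero terms
```

With alpha = 0 only the fusion term is active, which matches the description of alpha as the ratio of sparsity penalty to fusion penalty.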

Value

A list containing the following items will be returned:

family

the response/model type

criterion

model selection criterion used

alpha

the ratio of sparsity penalty to fusion penalty

if.fuse

whether each covariate is assumed to be heterogeneous (1) or homogeneous (0) across datasets

betahat

the estimated regression coefficients

betainfo

additional information about the fit, including degrees of freedom, optimal lambda value, maximum lambda value to fuse all coefficients, and estimated friction of fusion
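
One way to read the result is to count the distinct fitted values in each column of betahat, which gives the number of coefficient clusters per covariate. The sketch below rebuilds by hand the betahat matrix printed in the first example output (so it runs without fitting a model) and recovers the cluster counts and memberships. In the printed examples these counts coincide with the DF row of betainfo.

```r
# betahat as printed for the first example (fuse.which = c(1)), typed in by hand
betahat <- cbind(
  Intercept = rep(-0.009447474, 10),
  x1        = rep(c(-0.1067597, 1.0552546), each = 5),
  x2        = rep(0.4443723, 10)
)
rownames(betahat) <- paste0("study", 1:10)

# Number of distinct coefficient values (clusters) per covariate
n_clusters <- apply(betahat, 2, function(b) length(unique(b)))
n_clusters

# Cluster membership of each study for x1, labelled by increasing coefficient
x1_cluster <- match(betahat[, "x1"], sort(unique(betahat[, "x1"])))
split(rownames(betahat), x1_cluster)
```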

References

Lu Tang and Peter X.K. Song. Fused Lasso Approach in Regression Coefficients Clustering - Learning Parameter Heterogeneity in Data Integration. Journal of Machine Learning Research, 17(113):1-23, 2016.

Fei Wang, Lu Wang, and Peter X.K. Song. Fused lasso with the adaptation of parameter ordering in combining multiple studies with repeated measurements. Biometrics, DOI:10.1111/biom.12496, 2016.

Examples

########### generate data ###########
n <- 200    # sample size in each dataset (can also be a K-element vector)
K <- 10     # number of datasets for data integration
p <- 3      # number of covariates in X (including the intercept)

# the coefficient matrix of dimension K * p, used to specify the heterogeneous pattern
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,   # beta_0 of intercept
                  0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,   # beta_1 of X_1
                  0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0),  # beta_2 of X_2
                K, p)

# generate a data set, family=c("gaussian", "binomial", "poisson", "cox")
data <- datagenerator(n=n, beta0=beta0, family="gaussian", seed=123)

# prepare the input for metafuse
y       <- data$y
sid     <- data$group
X       <- data[,-c(1,ncol(data))]

########### run metafuse ###########
# fuse slopes of X1 (which is heterogeneous with 2 clusters)
metafuse(X=X, y=y, sid=sid, fuse.which=c(1), family="gaussian", intercept=TRUE, alpha=0,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)

# fuse slopes of X2 (which is heterogeneous with 3 clusters)
metafuse(X=X, y=y, sid=sid, fuse.which=c(2), family="gaussian", intercept=TRUE, alpha=0,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)

# fuse all three covariates
metafuse(X=X, y=y, sid=sid, fuse.which=c(0,1,2), family="gaussian", intercept=TRUE, alpha=0,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)

# fuse all three covariates, with sparsity penalty
metafuse(X=X, y=y, sid=sid, fuse.which=c(0,1,2), family="gaussian", intercept=TRUE, alpha=1,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)
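
The four calls above differ mainly in fuse.which. Judging from the if.fuse element printed in the example output, fuse.which appears to translate into a 0/1 indicator per coefficient, with 0 denoting the intercept. The sketch below mimics that mapping in base R; it is inferred from the printed results rather than taken from the package source, and the helper name fuse_indicator is hypothetical.

```r
# Hypothetical helper: turn fuse.which (0 = intercept) into a 0/1 indicator
# per coefficient, mirroring the if.fuse element shown in the example output
fuse_indicator <- function(fuse.which, coef.names) {
  idx <- seq_along(coef.names) - 1        # 0, 1, ..., p-1
  stopifnot(all(fuse.which %in% idx))     # reject out-of-range indices
  setNames(as.integer(idx %in% fuse.which), coef.names)
}

coef.names <- c("Intercept", "x1", "x2")
fuse_indicator(c(1), coef.names)          # only x1 may differ across datasets
fuse_indicator(c(2), coef.names)          # only x2 may differ across datasets
fuse_indicator(c(0, 1, 2), coef.names)    # every coefficient may differ
```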

Example output

************************** verbose (start) **************************
Lambda = 0
Intercept        x1        x2 
        1        10         1 
Lambda = 1e-04
Intercept        x1        x2 
        1         9         1 
Lambda = 0.0025
Intercept        x1        x2 
        1         8         1 
Lambda = 0.0562
Intercept        x1        x2 
        1         7         1 
Lambda = 0.1259
Intercept        x1        x2 
        1         6         1 
Lambda = 0.1778
Intercept        x1        x2 
        1         5         1 
Lambda = 0.3981
Intercept        x1        x2 
        1         4         1 
Lambda = 1.122
Intercept        x1        x2 
        1         3         1 
Lambda = 1.5849
Intercept        x1        x2 
        1         2         1 
Lambda = 177.8279
Intercept        x1        x2 
        1         1         1 
************************** verbose (end) **************************
Press [enter] to continue
Press [enter] to continue
$family
[1] "gaussian"

$criterion
[1] "EBIC"

$alpha
[1] 0

$if.fuse
Intercept        x1        x2 
        0         1         0 

$betahat
           Intercept         x1        x2
study1  -0.009447474 -0.1067597 0.4443723
study2  -0.009447474 -0.1067597 0.4443723
study3  -0.009447474 -0.1067597 0.4443723
study4  -0.009447474 -0.1067597 0.4443723
study5  -0.009447474 -0.1067597 0.4443723
study6  -0.009447474  1.0552546 0.4443723
study7  -0.009447474  1.0552546 0.4443723
study8  -0.009447474  1.0552546 0.4443723
study9  -0.009447474  1.0552546 0.4443723
study10 -0.009447474  1.0552546 0.4443723

$betainfo
$betainfo$original_scale
            Intercept          x1       x2
DF           1.000000   2.0000000 1.000000
lambda_opt   2.511886   2.5118864 2.511886
lambda_fuse  0.000000 177.8279410 0.000000
friction          NaN   0.9858746      NaN

$betainfo$log10_scale
            Intercept        x1        x2
DF          1.0000000 2.0000000 1.0000000
lambda_opt  0.5455405 0.5455405 0.5455405
lambda_fuse 0.0000000 2.2524354 0.0000000
friction          NaN 0.7577997       NaN


************************** verbose (start) **************************
Lambda = 0
Intercept        x1        x2 
        1         1        10 
Lambda = 1e-04
Intercept        x1        x2 
        1         1         9 
Lambda = 0.0032
Intercept        x1        x2 
        1         1         8 
Lambda = 0.0079
Intercept        x1        x2 
        1         1         7 
Lambda = 0.0126
Intercept        x1        x2 
        1         1         6 
Lambda = 0.0282
Intercept        x1        x2 
        1         1         5 
Lambda = 0.1
Intercept        x1        x2 
        1         1         4 
Lambda = 0.2818
Intercept        x1        x2 
        1         1         3 
Lambda = 0.7943
Intercept        x1        x2 
        1         1         2 
Lambda = 2.8184
Intercept        x1        x2 
        1         1         1 
************************** verbose (end) **************************
Press [enter] to continue
Press [enter] to continue
$family
[1] "gaussian"

$criterion
[1] "EBIC"

$alpha
[1] 0

$if.fuse
Intercept        x1        x2 
        0         0         1 

$betahat
         Intercept        x1          x2
study1  0.02126244 0.4476072 -0.20473895
study2  0.02126244 0.4476072 -0.06921323
study3  0.02126244 0.4476072 -0.20473895
study4  0.02126244 0.4476072 -0.06921323
study5  0.02126244 0.4476072  0.52722701
study6  0.02126244 0.4476072  0.52722701
study7  0.02126244 0.4476072  0.77538250
study8  0.02126244 0.4476072  1.14221687
study9  0.02126244 0.4476072  1.14221687
study10 0.02126244 0.4476072  1.14221687

$betainfo
$betainfo$original_scale
             Intercept         x1         x2
DF          1.00000000 1.00000000 5.00000000
lambda_opt  0.02818383 0.02818383 0.02818383
lambda_fuse 0.00000000 0.00000000 2.81838293
friction           NaN        NaN 0.99000000

$betainfo$log10_scale
             Intercept         x1         x2
DF          1.00000000 1.00000000 5.00000000
lambda_opt  0.01207077 0.01207077 0.01207077
lambda_fuse 0.00000000 0.00000000 0.58187948
friction           NaN        NaN 0.97925555


************************** verbose (start) **************************
Lambda = 0
Intercept        x1        x2 
       10        10        10 
Lambda = 1e-04
Intercept        x1        x2 
        8         9        10 
Lambda = 4e-04
Intercept        x1        x2 
        8         9         9 
Lambda = 4e-04
Intercept        x1        x2 
        7         9         9 
Lambda = 6e-04
Intercept        x1        x2 
        6         9         9 
Lambda = 0.0013
Intercept        x1        x2 
        5         8         8 
Lambda = 0.0025
Intercept        x1        x2 
        4         8         8 
Lambda = 0.0028
Intercept        x1        x2 
        4         7         8 
Lambda = 0.0045
Intercept        x1        x2 
        4         7         7 
Lambda = 0.0071
Intercept        x1        x2 
        3         6         7 
Lambda = 0.01
Intercept        x1        x2 
        3         5         6 
Lambda = 0.0316
Intercept        x1        x2 
        3         5         5 
Lambda = 0.0355
Intercept        x1        x2 
        3         4         5 
Lambda = 0.0562
Intercept        x1        x2 
        3         3         4 
Lambda = 0.0708
Intercept        x1        x2 
        2         3         4 
Lambda = 0.0794
Intercept        x1        x2 
        1         3         4 
Lambda = 0.0891
Intercept        x1        x2 
        1         2         4 
Lambda = 0.1585
Intercept        x1        x2 
        1         2         3 
Lambda = 3.9811
Intercept        x1        x2 
        1         2         2 
Lambda = 5.6234
Intercept        x1        x2 
        1         2         1 
Lambda = 15.8489
Intercept        x1        x2 
        1         1         1 
************************** verbose (end) **************************
Press [enter] to continue
Press [enter] to continue
Press [enter] to continue
Press [enter] to continue
$family
[1] "gaussian"

$criterion
[1] "EBIC"

$alpha
[1] 0

$if.fuse
Intercept        x1        x2 
        1         1         1 

$betahat
          Intercept          x1           x2
study1  0.003334298 -0.01606108 -0.006970854
study2  0.003334298 -0.01606108 -0.006970854
study3  0.003334298 -0.01606108 -0.006970854
study4  0.003334298 -0.01606108 -0.006970854
study5  0.003334298 -0.01606108  0.513345365
study6  0.003334298  0.96328884  0.513345365
study7  0.003334298  0.96328884  0.513345365
study8  0.003334298  0.96328884  1.008996127
study9  0.003334298  0.96328884  1.008996127
study10 0.003334298  0.96328884  1.008996127

$betainfo
$betainfo$original_scale
             Intercept         x1        x2
DF          1.00000000  2.0000000 3.0000000
lambda_opt  0.15848932  0.1584893 0.1584893
lambda_fuse 0.07943282 15.8489319 5.6234133
friction    0.00000000  0.9900000 0.9718162

$betainfo$log10_scale
             Intercept         x1         x2
DF          1.00000000 2.00000000 3.00000000
lambda_opt  0.06389203 0.06389203 0.06389203
lambda_fuse 0.03319562 1.22657238 0.82108185
friction    0.00000000 0.94791010 0.92218555


************************** verbose (start) **************************
Lambda = 0
Intercept        x1        x2 
       10        10        10 
Lambda = 1e-04
Intercept        x1        x2 
        8         9        10 
Lambda = 4e-04
Intercept        x1        x2 
        8         9         9 
Lambda = 4e-04
Intercept        x1        x2 
        7         9         9 
Lambda = 6e-04
Intercept        x1        x2 
        6         9         9 
Lambda = 0.0013
Intercept        x1        x2 
        5         8         9 
Lambda = 0.0018
Intercept        x1        x2 
        5         8         8 
Lambda = 0.0025
Intercept        x1        x2 
        4         8         8 
Lambda = 0.0028
Intercept        x1        x2 
        4         7         8 
Lambda = 0.0045
Intercept        x1        x2 
        4         7         7 
Lambda = 0.0071
Intercept        x1        x2 
        3         7         7 
Lambda = 0.0079
Intercept        x1        x2 
        3         6         7 
Lambda = 0.01
Intercept        x1        x2 
        3         5         6 
Lambda = 0.0316
Intercept        x1        x2 
        3         5         5 
Lambda = 0.0398
Intercept        x1        x2 
        3         4         5 
Lambda = 0.0562
Intercept        x1        x2 
        3         4         4 
Lambda = 0.0708
Intercept        x1        x2 
        2         4         4 
Lambda = 0.0794
Intercept        x1        x2 
        1         4         4 
Lambda = 0.0891
Intercept        x1        x2 
        1         2         4 
Lambda = 0.1778
Intercept        x1        x2 
        1         2         3 
Lambda = 3.9811
Intercept        x1        x2 
        1         2         2 
Lambda = 14.1254
Intercept        x1        x2 
        1         2         1 
Lambda = 35.4813
Intercept        x1        x2 
        1         1         1 
************************** verbose (end) **************************
Press [enter] to continue
Press [enter] to continue
Press [enter] to continue
Press [enter] to continue
$family
[1] "gaussian"

$criterion
[1] "EBIC"

$alpha
[1] 1

$if.fuse
Intercept        x1        x2 
        1         1         1 

$betahat
          Intercept        x1        x2
study1  0.003676365 0.0000000 0.0000000
study2  0.003676365 0.0000000 0.0000000
study3  0.003676365 0.0000000 0.0000000
study4  0.003676365 0.0000000 0.0000000
study5  0.003676365 0.0000000 0.5121932
study6  0.003676365 0.9632209 0.5121932
study7  0.003676365 0.9632209 0.5121932
study8  0.003676365 0.9632209 1.0072402
study9  0.003676365 0.9632209 1.0072402
study10 0.003676365 0.9632209 1.0072402

$betainfo
$betainfo$original_scale
             Intercept         x1         x2
DF          1.00000000  2.0000000  3.0000000
lambda_opt  0.17782794  0.1778279  0.1778279
lambda_fuse 0.07943282 35.4813389 14.1253754
friction    0.00000000  0.9949881  0.9874107

$betainfo$log10_scale
             Intercept         x1         x2
DF          1.00000000 2.00000000 3.00000000
lambda_opt  0.07108185 0.07108185 0.07108185
lambda_fuse 0.03319562 1.56207077 1.17970616
friction    0.00000000 0.95449511 0.93974614
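
The numbers printed in betainfo appear to be linked by two simple relations: the log10_scale rows equal log10(lambda + 1) of the original_scale rows, and friction equals max(lambda_fuse - lambda_opt, 0) / lambda_fuse, which yields NaN when lambda_fuse is 0. Both relations are inferred from the output above rather than from the package source, so the sketch below is best read as a consistency check on the first example.

```r
# Values copied from the first example output (intercept and x1)
lambda_opt  <- c(Intercept = 2.511886, x1 = 2.5118864)
lambda_fuse <- c(Intercept = 0,        x1 = 177.8279410)

# Inferred relation 1: the log10 scale is log10(lambda + 1)
to_log10 <- function(lambda) log10(lambda + 1)

# Inferred relation 2: friction is the share of the path lying between
# lambda_opt and lambda_fuse, floored at 0 (0/0 gives NaN)
friction <- function(opt, fuse) pmax(fuse - opt, 0) / fuse

to_log10(lambda_opt)                                      # compare with 0.5455405
to_log10(lambda_fuse)                                     # compare with 0 and 2.2524354
friction(lambda_opt, lambda_fuse)                         # compare with NaN and 0.9858746
friction(to_log10(lambda_opt), to_log10(lambda_fuse))     # compare with NaN and 0.7577997
```

Under this reading, a friction close to 1 means the selected lambda sits far below the value at which all coefficients of that covariate fuse, while 0 means the covariate is already fully fused at the selected lambda.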

metafuse documentation built on May 2, 2019, 2:15 p.m.