sse: Trained-on-Observed-Studies Ensemble (Study-Specific...

View source: R/SSE.caret.mSSL.R

Description

Trained-on-Observed-Studies Ensemble (Study-Specific Ensemble) for Multi-Study Learning: fits one or more models on each study and ensembles the fitted models.

Usage

sse(formula = Y ~ ., data, target.study = NA, sim.covs = NA,
  ssl.method = list("lm"), ssl.tuneGrid = list(c()),
  sim.mets = FALSE, model = FALSE, customFNs = list(),
  stack.standardize = FALSE)

Arguments

formula

Model formula

data

A dataframe containing all the studies, with the following columns in this order: "Study", "Y", "V1", ..., "Vp".

target.study

Dataframe containing the design matrix (covariates only) of the study on which one aims to make predictions.

sim.covs

A vector of covariate names or covariate column numbers to be used for the similarity measure. Default is to use all covariates.

ssl.method

A list of strings indicating which modeling methods to use.

ssl.tuneGrid

A list of tuning-parameter grids in the format of the caret package. Each element must be a dataframe (as required by caret). If a method requires no tuning parameters, use NA for that element.

sim.mets

Boolean indicating whether to calculate default covariate profile similarity measures.

model

Boolean indicating whether to attach the training data to the model object.

customFNs

Optional list of functions that can be used to add custom covariate profile similarity measures.

stack.standardize

Boolean determining whether stacking weights are standardized to sum to 1. Default is FALSE.
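
The argument conventions above can be checked in base R before calling sse. The following is a minimal sketch, not the package's own code: the object names (toy, w, w_std) are hypothetical, the pairing of ssl.tuneGrid elements with ssl.method by position is inferred from the Examples, and dividing by the sum is only the most natural reading of "standardized to sum to 1".

```r
# Hypothetical toy data: 3 studies, 2 covariates (names are illustrative).
set.seed(1)
n <- 30
toy <- data.frame(
  Study = rep(1:3, each = n / 3),  # study label comes first
  Y     = rnorm(n),                # outcome comes second
  V1    = rnorm(n),                # covariates follow
  V2    = rnorm(n)
)

# One tuning grid per method, assumed to be matched by position.
ssl.method   <- list("lm", "pcr")
ssl.tuneGrid <- list(NA,                     # "lm" has no tuning parameters
                     data.frame(ncomp = 1))  # "pcr" tunes ncomp

# Sanity checks on column order and on the method/grid pairing.
layout_ok <- identical(names(toy)[1:2], c("Study", "Y"))
grids_ok  <- length(ssl.method) == length(ssl.tuneGrid)

# Assumed meaning of stack.standardize = TRUE: rescale weights to sum to 1.
w     <- c(0.2, 0.5, 0.1)  # hypothetical raw stacking weights
w_std <- w / sum(w)
```

A dataframe like toy and the two lists could then be passed as data, ssl.method and ssl.tuneGrid respectively.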

Value

A model object of studyStrap class "ss" that can be used to make predictions.

Examples

##########################
##### Simulate Data ######
##########################

set.seed(1)
# create half of training dataset from 1 distribution
X1 <- matrix(rnorm(2000), ncol = 2) # design matrix - 2 covariates
B1 <- c(5, 10, 15) # true beta coefficients
y1 <- cbind(1, X1) %*% B1

# create 2nd half of training dataset from another distribution
X2 <- matrix(rnorm(2000, 1,2), ncol = 2) # design matrix - 2 covariates
B2 <- c(10, 5, 0) # true beta coefficients
y2 <- cbind(1, X2) %*% B2

X <- rbind(X1, X2)
y <- c(y1, y2)

study <- sample.int(10, 2000, replace = TRUE) # 10 studies
data <- data.frame( Study = study, Y = y, V1 = X[,1], V2 = X[,2] )

# create target study design matrix for covariate profile similarity weighting and
# accept/reject algorithm (covariate-matched study strap)
target <- matrix(rnorm(1000, 3, 5), ncol = 2) # design matrix
colnames(target) <- c("V1", "V2")

##########################
##### Model Fitting #####
##########################

sseMod <- sse(formula = Y ~.,
             data = data,
             ssl.method = list("pcr"),
             ssl.tuneGrid = list(data.frame("ncomp" = 1)),
             model = FALSE,
             customFNs = list() )


## Fit models with Target Study Specified ##

# Fit model with 1 Single-Study Learner (SSL): Linear Regression
sseMod1 <- sse(formula = Y ~.,
             data = data,
             target.study = target,
             ssl.method = list("lm"),
             ssl.tuneGrid = list(NA),
             sim.mets = FALSE,
             model = FALSE,
             customFNs = list() )

# Fit model with 2 SSLs: Linear Regression and PCA Regression
sseMod2 <- sse(formula = Y ~.,
             data = data,
             target.study = target,
             ssl.method = list("lm", "pcr"),
             ssl.tuneGrid = list(NA,
                             data.frame("ncomp" = 1)),
             sim.mets = TRUE,
             model = FALSE,
             customFNs = list() )



# Fit model with custom similarity function for
# covariate profile similarity weighting

fn1 <- function(x1, x2){
  return( abs( cor( colMeans(x1), colMeans(x2) )) )
}

sseMod3 <- sse(formula = Y ~.,
             data = data,
             target.study = target,
             ssl.method = list("lm", "pcr"),
             ssl.tuneGrid = list(NA,
                             data.frame("ncomp" = 1)),
             sim.mets = TRUE,
             model = FALSE,
             customFNs = list(fn1) )

#########################
#####  Predictions ######
#########################

preds <- studyStrap.predict(sseMod1, target)
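
To see what a custom similarity function such as fn1 computes, it can be evaluated outside of sse. The sketch below uses only base R and assumes that each function in customFNs receives two design matrices (one study's covariates and the target's covariates) and returns a single number; the names sim_fn, toy and target_toy are hypothetical. Three covariates are used because the correlation of two length-2 mean vectors is always 1 in absolute value, so with only two covariates this particular measure would not distinguish between studies.

```r
set.seed(2)
p <- 3  # number of covariates (hypothetical)

# Hypothetical training data: 4 studies with 25 rows each.
toy <- data.frame(
  Study = rep(1:4, each = 25),
  Y     = rnorm(100),
  matrix(rnorm(100 * p), ncol = p,
         dimnames = list(NULL, paste0("V", 1:p)))
)

# Hypothetical target design matrix with the same covariate names.
target_toy <- matrix(rnorm(50 * p, mean = 1), ncol = p,
                     dimnames = list(NULL, paste0("V", 1:p)))

# Same form as fn1: two design matrices in, one number out.
sim_fn <- function(x1, x2) {
  abs(cor(colMeans(x1), colMeans(x2)))
}

# Covariate columns of each study, split by study label.
covs_by_study <- split(toy[, paste0("V", 1:p)], toy$Study)

# One similarity score per study, each compared against the target.
sims <- vapply(covs_by_study, sim_fn, numeric(1), x2 = target_toy)
print(sims)
```

Each entry of sims lies between 0 and 1, with larger values indicating that a study's covariate means line up more closely with those of the target.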

studyStrap documentation built on Feb. 20, 2020, 5:08 p.m.