r_prepare_data: Prepare data for regression routines


Description

Builds the X matrix and the response vector Y in the format expected by regression packages such as mgcv, caret, and glmnet.

Usage

r_prepare_data(data, response = "Y", exposure = "E", probe_names)

Arguments

data

the data frame containing the response, the exposure, and the genes, CpG sites, or covariates. Its columns should be named.

response

the column name of the response in the data argument

exposure

the column name of the exposure in the data argument

probe_names

the column names of the genes, CpG sites, or covariates

Value

a list of length 5:

X

the X matrix

Y

the response vector

E

the exposure vector

main_effect_names

the names of the main effects including the exposure

interaction_names

the names of the interaction effects
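
To make the shape of the returned list concrete, the following is a minimal base R sketch that builds a list with the same five names. It is an illustration only, not the package's implementation: the toy data frame, the column names probe1 and probe2, and the assumption that the interaction effects are probe-by-exposure products are all hypothetical.

```r
# Toy data (hypothetical): a response, a binary exposure, and two probes.
set.seed(1)
toy <- data.frame(
  Y      = rnorm(6),
  E      = rep(c(0, 1), each = 3),
  probe1 = rnorm(6),
  probe2 = rnorm(6)
)

response    <- "Y"
exposure    <- "E"
probe_names <- c("probe1", "probe2")

# Assumption: X holds a main effect for each probe and for the exposure,
# plus probe-by-exposure interactions. "- 1" drops the intercept column.
rhs <- paste0("~ (", paste(probe_names, collapse = " + "), ") * ", exposure, " - 1")
X <- model.matrix(as.formula(rhs), data = toy)

# model.matrix labels interaction columns with a colon, e.g. "probe1:E".
interaction_names <- grep(":", colnames(X), value = TRUE)
main_effect_names <- setdiff(colnames(X), interaction_names)

prepared <- list(
  X = X,
  Y = toy[[response]],
  E = toy[[exposure]],
  main_effect_names = main_effect_names,
  interaction_names = interaction_names
)

names(prepared)
dim(prepared$X)
```

Under these assumptions, the X element is a numeric matrix with one row per observation, which is the input form that matrix-based fitters such as glmnet expect.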

Examples

data("tcgaov")
tcgaov[1:5,1:6, with = FALSE]
Y <- log(tcgaov[["OS"]])
E <- tcgaov[["E"]]
genes <- as.matrix(tcgaov[,-c("OS","rn","subtype","E","status"),with = FALSE])
trainIndex <- drop(caret::createDataPartition(Y, p = 0.5, list = FALSE, times = 1))
testIndex <- setdiff(seq_len(length(Y)),trainIndex)

## Not run: 
cluster_res <- r_cluster_data(data = genes,
                              response = Y,
                              exposure = E,
                              train_index = trainIndex,
                              test_index = testIndex,
                              cluster_distance = "tom",
                              eclust_distance = "difftom",
                              measure_distance = "euclidean",
                              clustMethod = "hclust",
                              cutMethod = "dynamic",
                              method = "average",
                              nPC = 1,
                              minimum_cluster_size = 50)

pc_eclust_interaction <- r_prepare_data(data = cbind(cluster_res$clustersAddon$PC,
                                                     survival = Y[trainIndex],
                                                     subtype = E[trainIndex]),
                                        response = "survival", exposure = "subtype")
names(pc_eclust_interaction)
dim(pc_eclust_interaction$X)
pc_eclust_interaction$main_effect_names
pc_eclust_interaction$interaction_names

## End(Not run)

eclust documentation built on May 1, 2019, 8:46 p.m.