rsdr: Fit dimensional reduction models with resampling


View source: R/rsdr-function.R

Description

This function fits a dimensional reduction model with resampling. The dataset is resampled a number of times, and a model using the same dimensional reduction algorithm is fitted to each resample. These models can be used to transform the dataset into new dimensions using an estimated weight for each pair of input and output dimensions. The output dimensions are sorted from the highest to the lowest proportion of variance explained (PVE), so the dimensions with the highest PVE can be chosen as candidate features for developing a prediction model.
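The procedure described above can be sketched with base R alone. This is a minimal illustration, not the package's implementation: it assumes bootstrapping with stats::prcomp(), uses the built-in iris measurements as stand-in data, and averages absolute weights across resamples (an assumed way to summarize them, since the sign of a principal component is arbitrary). The names fit_one, fits, mean_pve, and mean_abs_weight are hypothetical.

```r
# Stand-in input: rows are samples, columns are numeric variables
x <- iris[, 1:4]

set.seed(33)
n_resample <- 30

# Fit one PCA on a bootstrap resample (rows drawn with replacement)
fit_one <- function(i) {
  idx <- sample(nrow(x), replace = TRUE)
  pca <- prcomp(x[idx, ], center = TRUE, scale. = TRUE)
  list(
    weight = pca$rotation,               # weights: input dimensions x output dimensions
    pve = pca$sdev^2 / sum(pca$sdev^2)   # proportion of variance explained
  )
}
fits <- lapply(seq_len(n_resample), fit_one)

# Average PVE per output dimension across resamples
mean_pve <- rowMeans(sapply(fits, function(f) f$pve))

# Average absolute weight per pair of input and output dimension
mean_abs_weight <- Reduce(`+`, lapply(fits, function(f) abs(f$weight))) / n_resample

round(mean_pve, 3)
round(mean_abs_weight, 3)
```

Because prcomp() returns components in decreasing order of variance, the averaged PVE is also in decreasing order, which mirrors the sorting of output dimensions described above.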

Usage

rsdr(
  data,
  rs_method = c("BS", "CV"),
  rs_number = c(30, 10),
  dr_method = c("PCA", "SVD"),
  sd_cutoff = 0,
  state = 33,
  cl = 1
)

Arguments

data

Input data, a data frame with samples in rows and the variables to be transformed in columns.

rs_method

Resampling method, a character string: "BS" for bootstrapping or "CV" for k-fold cross-validation.

rs_number

Number of resamples (for bootstrapping) or folds (for cross-validation), an integer. Common choices are 30 for bootstrapping and 10 for cross-validation.

dr_method

Dimensional reduction method, a character string: "PCA" for principal component analysis or "SVD" for singular value decomposition.

sd_cutoff

Standard deviation cutoff, a non-negative number. A variable is excluded if its standard deviation is equal to or lower than this number. The standard deviation is 0 when all values of a variable are the same; this situation (i.e. zero variance) is not allowed for dimensional reduction.

state

An integer to set random seed for reproducible results.

cl

Parallel cluster, a non-negative integer giving the number of CPU clusters used for parallel computation. Set to 1 if no parallelism is expected.
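The effect of sd_cutoff and the difference between the two resampling methods can be illustrated with a small base R sketch. It is written under assumptions and is not taken from the package source: the toy data frame df and the names keep, df_kept, bs_idx, fold, and cv_idx are hypothetical, and it assumes that each cross-validation resample consists of the rows outside one fold.

```r
# Toy data: column "b" is constant, so its standard deviation is 0
df <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = rep(1, 6),
  c = c(2, 1, 4, 3, 6, 5)
)

# Keep only variables whose standard deviation exceeds the cutoff
sd_cutoff <- 0
keep <- sapply(df, sd) > sd_cutoff
df_kept <- df[, keep, drop = FALSE]
names(df_kept)  # "a" "c"

set.seed(33)
n <- nrow(df_kept)

# Bootstrapping: each resample draws n row indices with replacement
bs_idx <- lapply(1:3, function(i) sample(n, replace = TRUE))

# k-fold cross-validation: rows are split into k folds, and each
# resample keeps the rows that are not in fold j (an assumption here)
k <- 3
fold <- sample(rep(seq_len(k), length.out = n))
cv_idx <- lapply(seq_len(k), function(j) which(fold != j))
```

With bootstrapping, rs_number resamples of full size are drawn; with cross-validation, rs_number would instead determine how many folds the rows are split into.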

Value

RSDR object, a list of results and parameters. Use plot() to visualize the weights that are used to transform all input dimensions into each output dimension.

Examples

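## Load packages (magrittr, tibble and Biobase are assumed to provide
## %>%, rownames_to_column()/column_to_rownames() and exprs())
library(rsdr)
library(medhist)
library(magrittr)
library(tibble)
library(Biobase)

## Create input example
data(medhistdata)
ps_remover=extract_nps_mh(medhistdata)

## Subset the samples selected by ps_remover and binarize the data
mh_bin_nps=
  medhistdata[ps_remover$key,] %>%
  `exprs<-`(
    exprs(.) %>%
      t() %>%
      as.data.frame() %>%
      rownames_to_column(var='id') %>%
      column_to_rownames(var='id') %>%
      t()
  ) %>%
  trans_binary(verbose=FALSE)

## Build a data frame with samples in rows and variables in columns
input=
  mh_bin_nps %>%
  exprs() %>%
  t() %>%
  as.data.frame()

## Fit dimensional reduction models with resampling
rsdr_bin_nps=rsdr(input,'CV',10,'PCA')

## Show fitting results
rsdr_bin_nps

## Plot weights to transform dimensions
plot(rsdr_bin_nps)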

herdiantrisufriyana/rsdr documentation built on Feb. 15, 2021, 7:55 p.m.