masher_soft: Soft masher


View source: R/mash.R

Description

If you use RStudio, the masher and spicer functions can help remind you which parameters go along with which ipa_brew flavor. The basic idea is to write spice(brew, with = spicer_<flavor>()) and mash(brew, with = masher_<flavor>()). Hitting the tab key with your cursor inside the parentheses of masher_<flavor>() will create a drop-down menu that lists the arguments that go along with your brew's flavor.

If you have no trouble remembering the parameters that go along with your brew's flavor, or if you just want your code to be more concise, you don't have to use the with argument. Instead, you can just specify parameter values directly using the ... argument in the mash and spice functions. In the examples below, both approaches are shown.

Usage

masher_soft(
  bs = TRUE,
  bs_maxit = 20,
  bs_thresh = 1e-09,
  bs_row.center = FALSE,
  bs_col.center = TRUE,
  bs_row.scale = FALSE,
  bs_col.scale = TRUE,
  si_type = "als",
  si_thresh = 1e-05,
  si_maxit = 100,
  si_final.svd = TRUE
)

Arguments

bs

a logical value. If TRUE, softImpute::biScale() is applied to data_ref or rbind(data_ref, data_new) prior to fitting softImpute models.

bs_maxit

an integer indicating the maximum number of iterations for the biScale algorithm.

bs_thresh

convergence threshold for the biScale algorithm.

bs_row.center

a logical value. If TRUE, row centering will be performed. If FALSE (default), then nothing is done.

bs_col.center

a logical value. If TRUE (default), column centering will be performed. If FALSE, then nothing is done.

bs_row.scale

a logical value. If TRUE, row scaling will be performed. If FALSE (default), then nothing is done.

bs_col.scale

a logical value. If TRUE (default), column scaling will be performed. If FALSE, then nothing is done.
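As a rough illustration of what the default settings (column centering and column scaling only) amount to, the base R sketch below standardizes each column using only its observed entries. This is a simplified stand-in, not the biScale algorithm itself: softImpute::biScale() is iterative and may differ in details, and the toy matrix is made up for the example.

```r
# Toy matrix with one missing entry (values are illustrative only)
x <- matrix(c(1, 2, NA, 4,
              10, 20, 30, 40), ncol = 2)

# Center each column on the mean of its observed entries
col_means <- colMeans(x, na.rm = TRUE)
centered <- sweep(x, 2, col_means, "-")

# Scale each column to unit variance over its observed entries
# (assumption: divide by n observed, not n - 1)
col_sds <- sqrt(colMeans(centered^2, na.rm = TRUE))
scaled <- sweep(centered, 2, col_sds, "/")

# Missing entries stay missing; they are filled in later by imputation
print(scaled)
```

Row centering and row scaling (bs_row.center, bs_row.scale) would apply the same idea across rows, which is why an iterative algorithm is needed when rows and columns are handled together.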

si_type

two algorithms are implemented: si_type = "svd" or the default si_type = "als". The "svd" algorithm repeatedly computes the svd of the completed matrix and soft-thresholds its singular values. Each new soft-thresholded svd is used to re-impute the missing entries. For large matrices of class "Incomplete", the svd is achieved by an efficient form of alternating orthogonal ridge regression. The "als" algorithm uses this same alternating ridge regression, but updates the imputation at each step, leading to quite substantial speedups in some cases. The "als" approach does not currently have the same theoretical convergence guarantees as the "svd" approach.

si_thresh

convergence threshold for the softImpute algorithm, measured as the relative change in the Frobenius norm between two successive estimates.

si_maxit

maximum number of iterations for the softImpute algorithm.

si_final.svd

only applicable to si_type = "als". The alternating ridge regressions do not lead to exact zeros. With the default si_final.svd = TRUE, a one-step unregularized iteration is performed at the final iteration, followed by soft-thresholding of the singular values, leading to hard zeros.
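The "svd" variant described under si_type can be sketched in a few lines of base R. The code below is a minimal illustration under stated assumptions, not the softImpute implementation: soft_svd and soft_impute_sketch are hypothetical helpers, lambda is an illustrative penalty, and the stopping rule uses the relative squared Frobenius change as one plausible reading of si_thresh. The "als" variant and the final.svd step are not modeled.

```r
# Soft-threshold the singular values of a matrix: shrink each one by
# lambda and floor at zero, then rebuild the matrix.
soft_svd <- function(m, lambda) {
  s <- svd(m)
  d <- pmax(s$d - lambda, 0)
  s$u %*% (d * t(s$v))
}

# Hypothetical helper (not part of softImpute or ipa): repeat
# "complete the matrix, soft-threshold its svd, re-impute" until the
# relative change between successive estimates drops below thresh.
soft_impute_sketch <- function(x, lambda = 1, thresh = 1e-05, maxit = 100) {
  miss <- is.na(x)
  filled <- x
  filled[miss] <- 0                      # start with zeros in missing cells
  est <- matrix(0, nrow(x), ncol(x))
  for (iter in seq_len(maxit)) {
    est_new <- soft_svd(filled, lambda)
    # relative squared Frobenius change between successive estimates
    change <- sum((est_new - est)^2) / max(sum(est^2), 1e-12)
    est <- est_new
    filled[miss] <- est[miss]            # re-impute only the missing entries
    if (change < thresh) break
  }
  list(completed = filled, iterations = iter)
}

# Toy low-rank matrix with two missing entries
set.seed(1)
x <- outer(1:6, 1:4) + matrix(rnorm(24, sd = 0.1), 6, 4)
x[2, 3] <- NA
x[5, 1] <- NA

fit <- soft_impute_sketch(x, lambda = 0.5)
fit$completed[c(2, 5), ]
```

In this sketch thresh and maxit play the roles of si_thresh and si_maxit: the loop stops as soon as either limit is reached, and observed entries are never overwritten.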

Value

a list with input values that can be passed directly into mash, e.g., mash(brew, with = masher_nbrs()) for a neighbors brew or mash(brew, with = masher_soft()) for a soft brew.

Examples

x1 = rnorm(100)
x2 = rnorm(100) + x1
x3 = rnorm(100) + x1 + x2

outcome = 0.5 * (x1 - x2 + x3)

data <- data.frame(x1=x1, x2=x2, x3=x3, outcome=outcome)

n_miss = 10

data[1:n_miss,'x1'] = NA
sft_brew <- brew_soft(data, outcome=outcome, bind_miss = FALSE)

# these two calls are equivalent
mash(sft_brew, with = masher_soft(bs = FALSE))
mash(sft_brew, bs = FALSE)

knn_brew <- brew_nbrs(data, outcome=outcome, bind_miss = TRUE)

# these two calls are equivalent
mash(knn_brew, with = masher_nbrs(fun_aggr_ctns = median))
mash(knn_brew, fun_aggr_ctns = median)

bcjaeger/ipa documentation built on May 7, 2020, 9:45 a.m.