condensier_options: (RETIRED) Setting Options for 'condensier'

Description

This function is now retired. Please use fit_density directly to set tuning parameters. Calling this function now has no effect. It previously provided additional options that controlled the estimation algorithm in the condensier package.

Usage

condensier_options(bin_estimator = speedglmR6$new(), parfit = FALSE,
  bin_method = c("equal.len", "equal.mass", "dhist"), nbins = NA,
  max_n_cat = 20, poolContinVar = FALSE, max_n_bin = 1000)

Arguments

bin_estimator

The estimator to use for fitting the binary outcomes. Defaults to speedglmR6, which fits the models with speedglm; another available option is glmR6.

parfit

Default is FALSE. Set to TRUE to use the foreach package (its foreach function and %dopar% operator) to perform parallel logistic regression fits and predictions for discretized continuous outcomes. This functionality requires registering a parallel backend before calling condensier functions, e.g., by loading the doParallel R package and running registerDoParallel(cores = ncores) for an integer number ncores of parallel jobs. For an example, see a test in "./tests/RUnit/RUnit_tests_04_netcont_sA_tests.R".
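The backend registration described above can be sketched as follows. This is a minimal illustration, assuming the foreach and doParallel packages are installed; the value of ncores and the toy loop body are placeholders chosen for this example, not anything condensier runs internally.

```r
library(foreach)
library(doParallel)

# Hypothetical number of parallel jobs; adjust to the machine at hand.
ncores <- 2L

# Register the parallel backend before any %dopar% loop is evaluated.
registerDoParallel(cores = ncores)

# Toy loop standing in for the per-bin model fits; results are
# combined into a vector in iteration order.
res <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

# Release any implicitly created worker processes.
stopImplicitCluster()
```

With a backend registered this way, code that relies on %dopar% distributes its iterations across the workers; without one, foreach falls back to sequential execution and issues a warning.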

bin_method

The method for choosing bins when discretizing and fitting the conditional continuous summary exposure variable sA. The default method is "equal.len", which partitions the range of sA into nbins intervals of equal length. Method "equal.mass" selects the bins data-adaptively based on equal mass (equal number of observations), i.e., each bin is defined so that it contains approximately the same number of observations as every other bin. The maximum number of observations in each bin is controlled by the parameter max_n_bin. Method "dhist" uses a mix of the above two approaches; see Denby and Mallows, "Variations on the Histogram" (2009), for more detail.
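The difference between the "equal.len" and "equal.mass" ideas can be illustrated in base R. This is a sketch of the general technique only, not the package's internal binning code; the simulated variable x and the choice of 5 bins are assumptions made for the example.

```r
set.seed(1)
x <- rexp(1000)   # skewed toy variable standing in for sA
nbins <- 5L

# Equal-length bins: split the observed range into intervals of equal width.
breaks_len <- seq(min(x), max(x), length.out = nbins + 1L)

# Equal-mass bins: place cut points at evenly spaced sample quantiles,
# so each bin holds roughly the same number of observations.
breaks_mass <- quantile(x, probs = seq(0, 1, length.out = nbins + 1L),
                        names = FALSE)

# Count observations per bin under each scheme.
counts_len  <- table(cut(x, breaks_len,  include.lowest = TRUE))
counts_mass <- table(cut(x, breaks_mass, include.lowest = TRUE))

print(counts_len)
print(counts_mass)
```

For a skewed variable like this one, the equal-length bins put most observations in the first few intervals, while the equal-mass bins are narrow where the data are dense and wide in the tail.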

nbins

Set the default number of bins when discretizing a continuous outcome variable with bin_method = "equal.len". If left as NA, the total number of equal-length intervals (bins) is set to the nearest integer to nobs/max_n_bin, where nobs is the total number of observations in the input data.
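As a worked example of the rule above, the default bin count can be computed by hand. The helper default_nbins is hypothetical (it is not a condensier function), and using round() for "nearest integer" is an assumption about how the rounding is done.

```r
# Hypothetical helper: number of bins implied by nbins = NA,
# taken as the nearest integer to nobs / max_n_bin.
default_nbins <- function(nobs, max_n_bin = 1000) {
  round(nobs / max_n_bin)
}

default_nbins(5400)                    # 5400 / 1000 = 5.4, nearest integer 5
default_nbins(5400, max_n_bin = 200)   # 5400 / 200 = 27
```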

max_n_cat

Maximum number of unique categories a categorical variable sA[j] can have. If sA[j] has more, it is automatically treated as continuous.
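The threshold rule can be sketched as a simple count of unique values. The helper is_categorical and the two simulated variables are hypothetical and only illustrate the rule as described; the package's own detection logic may differ in detail.

```r
# Hypothetical helper: a variable is kept as categorical only if it has
# at most max_n_cat distinct values.
is_categorical <- function(x, max_n_cat = 20) {
  length(unique(x)) <= max_n_cat
}

set.seed(1)
dose_level <- sample(1:5, 200, replace = TRUE)  # at most 5 distinct values
dose_cont  <- rnorm(200)                        # essentially all values distinct

is_categorical(dose_level)  # stays categorical
is_categorical(dose_cont)   # exceeds the threshold, treated as continuous
```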

poolContinVar

Set to TRUE to fit a pooled regression when modeling a binarized continuous outcome: the bin indicators are pooled across bins into a single outcome regression.

max_n_bin

Maximum number of observations per bin for a continuous outcome (applies directly when bin_method = "equal.mass" and indirectly when bin_method = "equal.len" and nbins = NA).

Value

Invisibly returns a list with old option settings.

See Also

print_condensier_opts


osofr/condensier documentation built on May 8, 2019, 11:14 p.m.