
This R6 class defines the fields and methods that control all parameters for non-parametric modeling and estimation of the multivariate joint conditional probability model `P(sA|sW)` for summary measures `(sA,sW)`. Note that `sA` can be multivariate, and each component `sA[j]` can be binary, categorical or continuous. The joint probability `P(sA|sW)` = `P(sA[1],...,sA[k]|sW)` is first factorized as `P(sA[1]|sW)` * `P(sA[2]|sW, sA[1])` * ... * `P(sA[k]|sW, sA[1],...,sA[k-1])`, where each of these conditional probability models is defined by a new instance of the `SummariesModel` class (and a corresponding instance of the `RegressionClass` class).

If `sA[j]` is binary, the conditional probability `P(sA[j]|sW,sA[1],...,sA[j-1])` is evaluated via a logistic regression model. When `sA[j]` is continuous (or categorical), its estimation is controlled by a new instance of the `ContinSummaryModel` class (or the `CategorSummaryModel` class), together with an accompanying new instance of the `RegressionClass` class. The range of a continuous `sA[j]` is first partitioned into `K` bins with the corresponding `K` bin indicators (`B_1,...,B_K`), and `K` new instances of the `SummariesModel` class are created, each instance defining a single logistic regression model for one binary bin indicator outcome `B_j` and predictors (`sW, sA[1],...,sA[k-1]`).

Thus, the first instance of the `RegressionClass` and `SummariesModel` classes automatically spawns recursive calls to new instances of these classes until the entire tree of binary logistic regressions that defines the joint probability `P(sA|sW)` is built.
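The factorization described above can be illustrated with a minimal base R sketch for two binary components. This is not the tmlenet implementation: it uses plain `glm` calls, and the simulated data and the names `dat`, `lik_binary` and `joint_lik` are assumptions made for illustration only.

```r
# Illustrative sketch only: plain glm() fits, not the tmlenet classes.
set.seed(1)
n   <- 500
sW  <- rnorm(n)
sA1 <- rbinom(n, 1, plogis(0.5 * sW))
sA2 <- rbinom(n, 1, plogis(-0.3 + 0.8 * sA1 - 0.4 * sW))
dat <- data.frame(sW, sA1, sA2)

# One logistic regression per factor of the chain rule:
# P(sA1 | sW) and P(sA2 | sW, sA1)
fit1 <- glm(sA1 ~ sW, family = binomial(), data = dat)
fit2 <- glm(sA2 ~ sW + sA1, family = binomial(), data = dat)

# Probability of the observed value of a binary outcome (hypothetical helper)
lik_binary <- function(fit, y) {
  p1 <- predict(fit, type = "response")  # fitted P(outcome = 1 | predictors)
  ifelse(y == 1, p1, 1 - p1)
}

# Joint conditional probability P(sA1, sA2 | sW) as the product of the factors
joint_lik <- lik_binary(fit1, dat$sA1) * lik_binary(fit2, dat$sA2)
head(joint_lik)
```

Each additional component of `sA` would add one more factor to the product, with all earlier components included among its predictors.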


An `R6Class` generator object

`outvar.class` - Character vector indicating the class of each outcome variable: `bin` / `cont` / `cat`.

`outvar` - Character vector of regression outcome variable names.

`predvars` - Either a pool of all character predictors (`sW`) or regression-specific predictor names.

`reg_hazard` - Logical, if TRUE, the joint probability model `P(outvar | predvars)` is factorized as `\prod_j P(outvar[j] | predvars)` for each `j` outvar (for fitting hazard).

`subset` - Subset expression (later evaluated to a logical vector in the environment of the data).

`ReplMisVal0` - Logical, if TRUE, all `gvars$misval` among predictors are replaced with `gvars$misXreplace` (0).

`nbins` - Integer number of bins used for a continuous outvar. The intervals are defined inside `ContinSummaryModel$new()` and then saved in this field.

`bin_nms` - Character vector of column names for bin indicators.

`useglm` - Logical, if TRUE, fit the logistic regression model using `glm.fit`; if FALSE, use `speedglm.wfit`.

`parfit` - Logical, if TRUE, use a parallel `foreach::foreach` loop to fit and predict binary logistic regressions (requires registering a back-end cluster prior to calling the fit/predict functions).

`bin_bymass` - Logical, for continuous outvar, create bin cutoffs based on equal mass distribution.

`bin_bydhist` - Logical, if TRUE, use the dhist approach for bin definitions. See Denby and Mallows, "Variations on the Histogram" (2009), for more.

`max_nperbin` - Integer, maximum number of observations allowed per bin.

`pool_cont` - Logical, if TRUE, pool binned continuous outvar observations across bins and fit only one regression model across all bins (adding bin_ID as an extra covariate).

`outvars_to_pool` - Character vector of names of the binned continuous outvars; should match `bin_nms`.

`intrvls.width` - Named numeric vector of bin widths (`bw_j : j=1,...,M`) for each bin in `self$intrvls`. When `sA` is not continuous, `intrvls.width` is set to 1. When `sA` is continuous and `intrvls.width` is not supplied, the intervals are determined inside `ContinSummaryModel$new()` and are assigned to this variable as a list, with `names(intrvls.width) <- reg$bin_nms`. Can be queried by `BinOutModel$predictAeqa()` as `intrvls.width[outvar]`.

`intrvls` - Numeric vector of cutoffs defining the bins, or a named list of numeric intervals for `length(self$outvar) > 1`.

`cat.levels` - Numeric vector of all unique values in the categorical outcome variable. Set by the `CategorSummaryModel` constructor.
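To make the binning fields concrete, here is a hedged base R sketch of equal-mass binning for one continuous outcome, roughly what `bin_bymass = TRUE` is described as doing. It is not the code inside `ContinSummaryModel$new()`: the use of `quantile` and `findInterval`, the simulated data, and the bin-indicator naming pattern `sA_B.1, ...` are all assumptions made for illustration.

```r
# Illustrative sketch only: equal-mass bins built with base R.
set.seed(2)
sA    <- rexp(200)   # simulated continuous outcome
nbins <- 5

# Equal-mass cutoffs: each bin holds roughly the same number of observations
intrvls <- quantile(sA, probs = seq(0, 1, length.out = nbins + 1), names = FALSE)

# Bin index (1..nbins) for every observation; the maximum falls in the last bin
bin_id <- findInterval(sA, intrvls, rightmost.closed = TRUE)

# One 0/1 indicator column per bin (column names are hypothetical)
bin_nms <- paste0("sA_B.", seq_len(nbins))
bin_ind <- sapply(seq_len(nbins), function(j) as.integer(bin_id == j))
colnames(bin_ind) <- bin_nms

# Bin widths, named by bin indicator as the intrvls.width field describes
intrvls.width <- diff(intrvls)
names(intrvls.width) <- bin_nms

table(bin_id)
intrvls.width
```

Each column of `bin_ind` could then serve as the binary outcome of one logistic regression, matching the one-model-per-bin-indicator description.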

`new(outvar.class = gvars$sVartypes$bin, outvar, predvars, subset, intrvls, ReplMisVal0 = TRUE, useglm = getopt("useglm"), parfit = getopt("parfit"), nbins = getopt("nbins"), bin_bymass = getopt("bin.method"), bin_bydhist = getopt("bin.method"), max_nperbin = getopt("maxNperBin"), pool_cont = getopt("poolContinVar"))`

Uses the arguments to instantiate an object of R6 class and define the future regression model.

`ChangeManyToOneRegresssion(k_i, reg)`

Take a clone of a parent `RegressionClass` (`reg`) for `length(self$outvar)` regressions and set self to a single univariate `k_i` regression for outcome `self$outvar[[k_i]]`.

`ChangeOneToManyRegresssions(regs_list)`

Take the clone of a parent `RegressionClass` for univariate (continuous outvar) regression and set self to `length(regs_list)` bin indicator outcome regressions.

`resetS3class()`

...
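The idea behind reducing a multi-outcome regression to a single univariate one can be sketched with plain lists. This is a hypothetical stand-in, not the `RegressionClass` method: the list layout and the helper `many_to_one` are invented for illustration, and it assumes that the predictors of the `k_i`-th regression are the base predictors plus all earlier outcomes, following the factorization in the description.

```r
# Hypothetical stand-in for a multi-outcome regression spec (not the tmlenet class)
reg <- list(
  outvar       = c("sA1", "sA2", "sA3"),
  outvar.class = c("bin", "cont", "cat"),
  predvars     = c("sW1", "sW2")
)

# Copy the parent spec and narrow it to the k_i-th outcome; earlier outcomes
# are appended to the predictors (an assumption based on the factorization)
many_to_one <- function(reg, k_i) {
  one <- reg
  one$outvar       <- reg$outvar[[k_i]]
  one$outvar.class <- reg$outvar.class[[k_i]]
  one$predvars     <- c(reg$predvars, reg$outvar[seq_len(k_i - 1)])
  one
}

# One univariate spec per outcome
regs <- lapply(seq_along(reg$outvar), many_to_one, reg = reg)
str(regs[[3]])
```

Because R lists are copied on modification, the parent `reg` is left unchanged, which loosely mirrors working on a clone of the parent object.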

`S3class`

...

`get.reg`

...

tmlenet documentation built on May 29, 2017, 2:22 p.m.
