This R6 class defines fields and methods that control all the parameters for non-parametric modeling and estimation of the multivariate joint conditional probability model P(sA|sW) for summary measures (sA,sW).
Note that sA can be multivariate and any component sA[j] can be binary, categorical or continuous.
The joint probability P(sA|sW) = P(sA[1],...,sA[k]|sW) is first factorized as P(sA[1]|sW) * P(sA[2]|sW, sA[1]) * ... * P(sA[k]|sW, sA[1],...,sA[k-1]), where each of these conditional probability models is defined by a new instance of the SummariesModel class (and a corresponding instance of the RegressionClass class).
If sA[j] is binary, the conditional probability P(sA[j]|sW,sA[1],...,sA[j-1]) is evaluated via a logistic regression model.
When sA[j] is continuous (or categorical), its estimation is controlled by a new instance of the ContinSummaryModel class (or the CategorSummaryModel class), together with an accompanying new instance of the RegressionClass class. The range of a continuous sA[j] is first partitioned into K bins, which define K bin indicators (B_1,...,B_K). K new instances of the SummariesModel class are then created, each defining a single logistic regression model for one binary bin indicator outcome B_i (i = 1,...,K) with predictors (sW, sA[1],...,sA[j-1]).
Thus, the first instances of the RegressionClass and SummariesModel classes automatically spawn recursive calls to new instances of these classes until the entire tree of binary logistic regressions that defines the joint probability P(sA|sW) is built.
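The factorization described above can be illustrated with a minimal base R sketch. It does not use this package's classes: the simulated data, the two `glm` fits, and the helper names `lik_binary` and `joint_prob` are all assumptions made for illustration, with two binary components of sA and a single baseline summary sW.

```r
# Simulated data (illustrative only): one summary sW and two binary
# components of sA.
set.seed(1)
n   <- 500
sW  <- rnorm(n)
sA1 <- rbinom(n, 1, plogis(0.5 * sW))
sA2 <- rbinom(n, 1, plogis(-0.3 * sW + 0.8 * sA1))
dat <- data.frame(sW, sA1, sA2)

# One logistic regression per factor of the chain rule.
fit1 <- glm(sA1 ~ sW,       family = binomial(), data = dat)  # P(sA[1] | sW)
fit2 <- glm(sA2 ~ sW + sA1, family = binomial(), data = dat)  # P(sA[2] | sW, sA[1])

# Probability of value a (0 or 1) given p1 = P(outcome = 1).
# Hypothetical helper, vectorized over p1 and a.
lik_binary <- function(p1, a) a * p1 + (1 - a) * (1 - p1)

# Joint conditional probability P(sA[1] = a1, sA[2] = a2 | sW) for each row
# of newdata. Hypothetical helper, not part of the package.
joint_prob <- function(a1, a2, newdata) {
  nd <- newdata
  nd$sA1 <- a1                      # condition the second model on a1
  p1 <- predict(fit1, newdata = nd, type = "response")
  p2 <- predict(fit2, newdata = nd, type = "response")
  lik_binary(p1, a1) * lik_binary(p2, a2)
}

# Probability of the observed (sA1, sA2) for every observation.
p_obs <- joint_prob(dat$sA1, dat$sA2, dat)
```

Because each factor is a proper conditional probability, the four values of `joint_prob` over (a1, a2) in {0,1} x {0,1} sum to 1 for every row.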
An R6Class generator object.
outvar.class - Character vector indicating the class of each outcome variable: bin / cont / cat.
outvar - Character vector of regression outcome variable names.
predvars - Either a pool of all character predictors (sW) or regression-specific predictor names.
reg_hazard - Logical, if TRUE, the joint probability model P(outvar | predvars) is factorized as \prod_jP(outvar[j] | predvars) for each j outvar (for fitting hazard).
subset - Subset expression (later evaluated to a logical vector in the environment of the data).
ReplMisVal0 - Logical, if TRUE all gvars$misval among predictors are replaced with gvars$misXreplace (0).
nbins - Integer number of bins used for a continuous outvar; the intervals are defined inside ContinSummaryModel$new() and then saved in this field.
bin_nms - Character vector of column names for bin indicators.
useglm - Logical, if TRUE then fit the logistic regression model using glm.fit, if FALSE use speedglm.wfit.
parfit - Logical, if TRUE then use a parallel foreach::foreach loop to fit and predict binary logistic regressions (requires registering a back-end cluster prior to calling the fit/predict functions).
bin_bymass - Logical, for continuous outvar, create bin cutoffs based on equal mass distribution.
bin_bydhist - Logical, if TRUE, use the dhist approach for bin definitions. See Denby and Mallows, "Variations on the Histogram" (2009), for more.
max_nperbin - Integer, maximum number of observations allowed per bin.
pool_cont - Logical, if TRUE pool binned continuous outvar observations across bins and fit a single regression model across all bins (adding bin_ID as an extra covariate).
outvars_to_pool - Character vector of names of the binned continuous outvars, should match bin_nms.
intrvls.width - Named numeric vector of bin widths (bw_j : j=1,...,M) for each bin in self$intrvls. When sA is not continuous, intrvls.width is set to 1. When sA is continuous and intrvls.width is not supplied, the intervals are determined inside ContinSummaryModel$new() and are assigned to this variable as a list, with names(intrvls.width) <- reg$bin_nms. Can be queried by BinOutModel$predictAeqa() as: intrvls.width[outvar].
intrvls - Numeric vector of cutoffs defining the bins, or a named list of numeric intervals when length(self$outvar) > 1.
cat.levels - Numeric vector of all unique values in the categorical outcome variable. Set by the CategorSummaryModel constructor.
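As a rough illustration of how the binning fields relate to one another, the base R sketch below builds equal-mass cutoffs, bin indicators, and bin widths for a simulated continuous outcome. It is not the package's implementation: the data, the choice of 5 bins, and the column-name pattern `sA_B.i` are assumptions, and only the variable names `nbins`, `intrvls`, `bin_nms`, and `intrvls.width` mirror the fields listed above.

```r
set.seed(2)
sA    <- rexp(200)   # simulated continuous outcome (illustrative only)
nbins <- 5           # assumed number of bins K

# Equal-mass cutoffs: sample quantiles at equally spaced probabilities,
# so each bin holds roughly the same number of observations.
intrvls <- unname(quantile(sA, probs = seq(0, 1, length.out = nbins + 1)))

# Bin membership (1..nbins) for each observation; rightmost.closed = TRUE
# places the sample maximum in the last bin.
bin_id <- findInterval(sA, intrvls, rightmost.closed = TRUE)

# K binary bin indicators B_1,...,B_K, one column per bin.
bin_nms <- paste0("sA_B.", seq_len(nbins))   # hypothetical column names
B <- sapply(seq_len(nbins), function(i) as.integer(bin_id == i))
colnames(B) <- bin_nms

# Bin widths, named by the corresponding bin indicator.
intrvls.width <- setNames(diff(intrvls), bin_nms)
```

Each column of `B` could then serve as the binary outcome of one logistic regression, and `intrvls.width[bin_nms[i]]` gives the width of the i-th bin.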
new(outvar.class = gvars$sVartypes$bin,
outvar, predvars, subset, intrvls,
ReplMisVal0 = TRUE,
useglm = getopt("useglm"),
parfit = getopt("parfit"),
nbins = getopt("nbins"),
bin_bymass = getopt("bin.method"),
bin_bydhist = getopt("bin.method"),
max_nperbin = getopt("maxNperBin"),
pool_cont = getopt("poolContinVar"))
Uses the arguments to instantiate an object of R6 class and define the future regression model.
ChangeManyToOneRegresssion(k_i, reg)
Take a clone of a parent RegressionClass (reg) for length(self$outvar) regressions and set self to a single univariate k_i regression for outcome self$outvar[[k_i]].
ChangeOneToManyRegresssions(regs_list)
Take the clone of a parent RegressionClass for univariate (continuous outvar) regression and set self to length(regs_list) bin indicator outcome regressions.
resetS3class()
...
S3class
...
get.reg
...
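The idea behind narrowing a multi-outcome regression specification to a single outcome, as described for ChangeManyToOneRegresssion(k_i, reg), can be sketched with plain lists. This is only an analogy under stated assumptions: the list `reg`, the variable names in it, and the helper `to_univariate_spec` are hypothetical, and the way earlier outcomes are appended to the predictors follows the chain-rule factorization in the description rather than the package's source.

```r
# Plain-list stand-in for a multi-outcome regression specification
# (hypothetical variable names).
reg <- list(
  outvar       = c("sA1", "sA2", "sA3"),
  outvar.class = c("bin", "cont", "bin"),
  predvars     = c("sW1", "sW2")
)

# Hypothetical helper: copy the parent specification and narrow it to
# outcome k_i, adding the earlier outcomes to the predictors.
to_univariate_spec <- function(k_i, reg) {
  child <- reg                                # R lists are copied on modify
  child$outvar       <- reg$outvar[[k_i]]
  child$outvar.class <- reg$outvar.class[[k_i]]
  child$predvars     <- c(reg$predvars, reg$outvar[seq_len(k_i - 1)])
  child
}

# One univariate specification per outcome in the parent specification.
specs <- lapply(seq_along(reg$outvar), to_univariate_spec, reg = reg)
```

The parent `reg` is left unchanged, which loosely mirrors working on a clone of the parent object.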