BinaryOutModel stores and manages the (binarized/discretized) design matrix Xmat and the outcome Bin for the binary regression P(Bin|Xmat). Its self$estimator field selects among candidate estimators for the fitting and prediction library, such as data-adaptive Super Learner algorithms and parametric logistic regression. When fitting one pooled regression across multiple bins, it provides a method to convert the data from wide to long format on request (to gain computational efficiency).
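The wide-to-long conversion mentioned above can be illustrated with a minimal base R sketch. This is not the package's internal code: the column names (B_1, B_2, B_3, bin_ID) and the convention that bins after the observed one are NA are assumptions made for illustration only.

```r
# Toy wide data: one row per observation, one column per bin indicator.
# Assumption: NA marks bins that come after the bin the observation fell into.
wide <- data.frame(
  ID  = 1:3,
  W   = c(0.2, 1.5, -0.7),
  B_1 = c(1, 0, 0),
  B_2 = c(NA, 1, 0),
  B_3 = c(NA, NA, 1)
)
bin_names <- c("B_1", "B_2", "B_3")  # hypothetical bin indicator names

# Stack one block of rows per bin, keeping the bin index as a covariate.
long <- do.call(rbind, lapply(seq_along(bin_names), function(j) {
  data.frame(ID = wide$ID, W = wide$W, bin_ID = j, Bin = wide[[bin_names[j]]])
}))

# Drop rows where the bin indicator is undefined, then sort by observation.
long <- long[!is.na(long$Bin), ]
long <- long[order(long$ID, long$bin_ID), ]
rownames(long) <- NULL
print(long)

# A single pooled logistic regression could then be fit on the long data,
# e.g. glm(Bin ~ W + factor(bin_ID), family = binomial(), data = long).
```

In the long format each observation contributes one row per bin it was at risk for, so one regression replaces a separate fit per bin.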
An R6Class generator object
bin_names - Character vector of names of the bins.
ID - Integer vector of observation IDs, 1:n, used for pooling.
pooled_bin_name - Original name of the continuous covariate that was discretized into bins and then pooled.
nbins - Number of bins used for estimation of a continuous outvar, defined in ContinModel$new().
estimator - Character, one of "speedglm__glm" (default), "glm__glm", "h2o__ensemble", "SuperLearner".
outvar - Character, outcome name.
predvars - Character vector of predictor names.
cont.sVar.flag - Logical. If TRUE, indicates that the original outcome variable is continuous.
bw.j - Bin width of a bin indicator obtained from the discretization of a continuous covariate.
is.fitted - Logical. If TRUE, indicates that the BinaryOutModel class object has already been fitted.
pool_cont - Logical. If TRUE, perform pooling of bins.
outvars_to_pool - Character vector of outcome bin names for pooling.
ReplMisVal0 - Logical. If TRUE, the user-supplied gvars$misXreplace (default 0) is used to replace all gvars$misval among the predictors. The ReplMisVal0 value in RegressionClass is used when instantiating a new BinaryOutModel object.
n - Number of rows in the input data.
subset_expr - Vector of length n that specifies a subset of data to be used in the fitting process. Either logical, expression or indices.
subset_idx - Logical version of subset_expr.
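How a logical, index, or expression subset could be normalized into a logical subset_idx is sketched below in base R. The helper to_logical_subset() is hypothetical and is not the package's define.subset.idx(); it only illustrates the three input forms listed above, and treating NA results as FALSE is an assumption of this sketch.

```r
# Hypothetical helper: convert a subset specification to a logical vector
# with one entry per row of `data`.
to_logical_subset <- function(subset_expr, data) {
  n <- nrow(data)
  if (is.logical(subset_expr)) {
    # Already logical: only check the length.
    stopifnot(length(subset_expr) == n)
    return(subset_expr)
  }
  if (is.numeric(subset_expr)) {
    # Row indices: mark the selected rows as TRUE.
    idx <- logical(n)
    idx[subset_expr] <- TRUE
    return(idx)
  }
  # Otherwise treat it as an unevaluated expression over the columns of `data`.
  res <- eval(subset_expr, envir = data)
  stopifnot(is.logical(res), length(res) == n)
  # Assumption: rows where the condition is NA are excluded.
  res & !is.na(res)
}

d <- data.frame(W = c(-1, 0.5, 2, NA))
s_logical <- to_logical_subset(c(TRUE, FALSE, TRUE, FALSE), d)
s_index   <- to_logical_subset(c(1, 3), d)
s_expr    <- to_logical_subset(quote(W > 0), d)
print(rbind(s_logical, s_index, s_expr))
```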
new(reg)
Use reg (a RegressionClass class object) to instantiate a new object of BinaryOutModel for a single binary regression.
newdata(newdata, getoutvar = TRUE, ...)
Evaluate the subset and perform the correct subsetting of the data to construct X_mat, Yvals & wt_vals.
define.subset.idx(data)
Create a logical vector converted from subset_expr.
fit(overwrite = FALSE, data, predict = FALSE, savespace = TRUE, ...)
Fit a binary regression. overwrite is logical: if FALSE (default), a previously fitted model cannot be overwritten by a new fit. savespace is logical: if TRUE (default), wipe out all internal data when doing many stacked regressions.
copy.fit(bin.out.model)
Take a fitted BinaryOutModel object as input and save its fit to itself.
predict(newdata, savespace = TRUE, ...)
Predict the response P(A = 1|W = w, E = e).
copy.predict(bin.out.model)
Take a BinaryOutModel object that contains the predictions for P(A = 1|W = w, E = e) and save them to itself.
predictAeqa(newdata, bw.j.sA_diff, savespace = TRUE, wipeProb = TRUE)
Predict the response P(A = a|W = w, E = e) for the observed A, W, E. wipeProb is a logical argument passed to self$wipe.alldat. If FALSE, the vectors probA1 & probAeqa are kept.
show()
Print regression formula, including outcome and predictor names.
wipe.alldat(wipeProb = TRUE)
...
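The relationship between the two predicted quantities above, P(A = 1|W, E) and P(A = a|W, E) at the observed a, can be sketched for a binary A with plain glm(). This is an illustration under simulated data, not the class's internal code; the names probA1 and probAeqa simply mirror the vectors mentioned above, and the bin-width argument bw.j.sA_diff is not covered here.

```r
set.seed(1)
n <- 200
W <- rnorm(n)                            # simulated baseline covariate
A <- rbinom(n, 1, plogis(0.5 * W))       # simulated binary exposure

# Fit P(A = 1 | W) by logistic regression and predict on the same data.
fit    <- glm(A ~ W, family = binomial())
probA1 <- predict(fit, type = "response")

# Likelihood of the observed exposure: p when A = 1, (1 - p) when A = 0.
probAeqa <- probA1^A * (1 - probA1)^(1 - A)

summary(probAeqa)
```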
getfit
...
getprobA1
...
getprobAeqa
...
emptydata
...
emptyY
...
emptyWeight
...
emptySubset_idx
...
getXmat
...
getY
...
getWeight
...
DatKeepClass, RegressionClass, tmleCom_Options
## Not run:
#***************************************************************************************
# Example 1: Estimate an outcome regression directly through BinaryOutModel
data(indSample.iid.bA.bY.rareJ2_list)
indSample.iid.bA.bY.rareJ2 <- indSample.iid.bA.bY.rareJ2_list$indSample.iid.bA.bY.rareJ2
N <- nrow(indSample.iid.bA.bY.rareJ2)
# Use speedglm to fit regressions (GLM fitting suited to medium-to-large datasets)
tmleCom_Options(Qestimator = "speedglm__glm", maxNperBin = N)
options(tmleCommunity.verbose = TRUE) # Print status messages
#***************************************************************************************
#***************************************************************************************
# 1.1 Specifying outcome and predictor variables for outcome mechanism
#***************************************************************************************
# Y depends on all its parent nodes (A, W1, W2, W3, W4)
Qform.all <- Y ~ W1 + W2 + W3 + W4 + A
Q.sVars1 <- tmleCommunity:::define_regform(regform = Qform.all)
# Equivalent way to define Q.sVars: use Anodes.lst (outcomes) & Wnodes.lst (predictors)
# nodes can only contain one or more of Ynode, Anodes, WEnodes, communityID and Crossnodes
nodes <- list(Ynode = "Y", Anodes = "A", WEnodes = c("W1", "W2", "W3", "W4"))
Q.sVars2 <- tmleCommunity:::define_regform(regform = NULL, Anodes.lst = nodes$Ynode,
                                           Wnodes.lst = nodes[c("Anodes", "WEnodes")])
# Interaction terms can also be included in the regression formula (correct Qform)
Qform.interact <- Y ~ W1 + W2*A + W3 + W4
Q.sVars3 <- tmleCommunity:::define_regform(regform = Qform.interact)
# Alternative way to define Qform.interact
Qform.interact2 <- Y ~ W1 + W2 + W3 + W4 + A + W2:A
Q.sVars4 <- tmleCommunity:::define_regform(regform = Qform.interact2)
#***************************************************************************************
# 1.2 Fit and predict a regression model for outcome mechanism Qbar(A, W)
#***************************************************************************************
# Create a new object of DatKeepClass that can store and manipulate the input data
OData_R6 <- DatKeepClass$new(Odata = indSample.iid.bA.bY.rareJ2,
                             nodes = nodes, norm.c.sVars = FALSE)
# Add a vector of observation (sampling) weights that encodes knowledge of rare outcome
OData_R6$addObsWeights(obs.wts = indSample.iid.bA.bY.rareJ2_list$obs.wt.J2)
# Create a new object of RegressionClass that defines regression models
# using misspecified Qform (without interaction term)
Qreg <- RegressionClass$new(outvar = Q.sVars1$outvars, predvars = Q.sVars1$predvars,
                            subset_vars = (!rep_len(FALSE, N)))
# Set savespace = FALSE to keep everything produced during fitting, including models and data
m.Q.init <- BinaryOutModel$new(reg = Qreg)$fit(data = OData_R6, savespace = FALSE)
length(m.Q.init$getY) # 3000, the outcomes haven't been erased since savespace = FALSE
head(m.Q.init$getXmat) # the predictor matrix is kept since savespace = FALSE
m.Q.init$getfit$coef # Coefficients of the fitted regression
m.Q.init$is.fitted # TRUE
# Now fit the same regression model but set savespace to TRUE (only fitted model left)
# Need to set overwrite to TRUE to avoid error when m.Q.init is already fitted
m.Q.init <- m.Q.init$fit(overwrite = TRUE, data = OData_R6, savespace = TRUE)
all(is.null(m.Q.init$getXmat), is.null(m.Q.init$getY)) # TRUE, all wiped out
# Set savespace = TRUE to wipe out any traces of saved data in predict step
m.Q.init$predict(newdata = OData_R6, savespace = TRUE)
is.null(m.Q.init$getXmat) # TRUE, the covariates matrix has been erased to save RAM space
mean(m.Q.init$getprobA1) # 0.02175083, bad estimate since misspecified Qform
#***************************************************************************************
# 1.3 Same as above but using Super Learner (data-adaptive algorithms)
#***************************************************************************************
# Specifying the SuperLearner library in tmleCom_Options()
library(SuperLearner)
tmleCom_Options(SL.library = c("SL.glm", "SL.randomForest"), maxNperBin = N)
# Instead of re-creating a RegressionClass object, change the estimator directly in Qreg,
# so there is no need to redefine Qestimator in tmleCom_Options()
Qreg$estimator <- "SuperLearner"
set.seed(12345)
m.Q.init <- BinaryOutModel$new(reg = Qreg)$fit(data = OData_R6, savespace = TRUE)
m.Q.init$predict(newdata = OData_R6, savespace = TRUE)
mean(m.Q.init$getprobA1)
## End(Not run)