| ipd | R Documentation |
The main wrapper function for conducting IPD using various methods and models. It returns a list of fitted model components.
ipd(
formula,
method,
model,
data,
label = NULL,
unlabeled_data = NULL,
intercept = TRUE,
alpha = 0.05,
alternative = "two-sided",
na_action = "na.fail",
...
)
formula |
An object of class |
method |
The IPD method to be used for fitting the model. Must be one
of |
model |
The type of downstream inferential model to be fitted, or the
parameter being estimated. Must be one of |
data |
A |
label |
A |
unlabeled_data |
(optional) A |
intercept |
|
alpha |
The significance level for confidence intervals. Default is
|
alternative |
A string specifying the alternative hypothesis. Must be
one of |
na_action |
(string, optional) How missing covariate data should be
handled. Currently |
... |
Additional arguments to be passed to the fitting function. See
the |
1. Formula:
The ipd function uses one formula argument that specifies both the
calibrating model (e.g., PostPI "relationship model", PPI "rectifier" model)
and the inferential model. These separate models will be created internally
based on the specific method called.
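As a rough illustration of how one formula can carry both models, the base R sketch below takes a formula of the form used in this page's examples (`Y - f ~ X1`) and pulls it apart. The split into `inferential_formula` and `calibrating_formula` is an assumption made for illustration, not the package's internal code, and both names are hypothetical.

```r
# Formula in the style used by this page's examples: observed Y, predicted f.
ipd_formula <- Y - f ~ X1

lhs_vars <- all.vars(ipd_formula[[2]])  # variables on the left-hand side
rhs_vars <- all.vars(ipd_formula[[3]])  # covariates on the right-hand side

# Assumed split (illustrative only): the first left-hand-side variable is the
# observed outcome and the second is the prediction.
inferential_formula <- reformulate(rhs_vars, response = lhs_vars[1])
calibrating_formula <- reformulate(lhs_vars[2], response = lhs_vars[1])

inferential_formula
calibrating_formula
```

How each method actually builds its calibrating model may differ from this two-formula picture.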
2. Data:
The data can be specified in two ways:
(1) A single data argument (data) containing a stacked
data.frame, plus a label identifier (label).
(2) Two data arguments, one for the labeled data (data) and one
for the unlabeled data (unlabeled_data).
For option (1), provide one data argument (data) which contains a
stacked data.frame with both the unlabeled and labeled data and a
label argument that specifies the column identifying the labeled
versus the unlabeled observations in the stacked data.frame (e.g.,
label = "set_label" if the column "set_label" in the stacked data
denotes which set an observation belongs to).
NOTE: Labeled data identifiers can be:
"l", "lab", "label", "labeled", "labelled", "tst", "test", "true"
TRUE
Non-reference category (i.e., binary 1)
Unlabeled data identifiers can be:
"u", "unlab", "unlabeled", "unlabelled", "val", "validation", "false"
FALSE
Reference category (i.e., binary 0)
For option (2), provide separate data arguments for the labeled data set
(data) and the unlabeled data set (unlabeled_data). If the
second argument is provided, the function ignores the label
identifier and assumes the data provided are not stacked.
NOTE: Columns in data or unlabeled_data are used only if they are
explicitly referenced in the formula argument or in the
label argument (if the data are passed as one stacked data frame).
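The two ways of supplying data can be sketched in base R as follows. The toy data frame is hypothetical; its column names (`Y`, `f`, `X1`, `set_label`) mirror those in this page's examples, and the values "labeled" and "unlabeled" are taken from the identifier lists above.

```r
# Hypothetical toy data; column names mirror those used in this page's examples.
set.seed(1)
stacked <- data.frame(
  Y  = rnorm(10),
  f  = rnorm(10),
  X1 = rnorm(10),
  set_label = rep(c("labeled", "unlabeled"), each = 5)
)

# Option (1): keep the stacked data.frame and name the identifier column.
label <- "set_label"

# Option (2): split the same rows into two separate data.frames.
is_labeled <- stacked[[label]] == "labeled"
keep_cols  <- setdiff(names(stacked), label)
data_l <- stacked[is_labeled, keep_cols]
data_u <- stacked[!is_labeled, keep_cols]

nrow(data_l)
nrow(data_u)
```

Under option (1), `stacked` and `label` would be passed as data and label. Under option (2), `data_l` and `data_u` would be passed as data and unlabeled_data.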
3. Method:
Use the method argument to specify the fitting method:
"chen": Gronsbell et al. (2026) Chen and Chen Correction
"pdc": Gan et al. (2024) Prediction Decorrelated Inference
"postpi_analytic": Wang et al. (2020) Post-Prediction Inference (PostPI) Analytic Correction
"postpi_boot": Wang et al. (2020) Post-Prediction Inference (PostPI) Bootstrap Correction
"ppi": Angelopoulos et al. (2023) Prediction-Powered Inference (PPI)
"ppi_a": Gronsbell et al. (2025) PPI "All" Correction
"ppi_plusplus": Angelopoulos et al. (2023) PPI++
"pspa": Miao et al. (2023) Assumption-Lean and Data-Adaptive Post-Prediction Inference (PSPA)
4. Model:
Use the model argument to specify the type of downstream inferential
model or parameter to be estimated:
Mean value of a continuous outcome
qth quantile of a continuous outcome
Linear regression coefficients for a continuous outcome
Logistic regression coefficients for a binary outcome
Poisson regression coefficients for a count outcome
The ipd wrapper function will concatenate the method and
model arguments to identify the required helper function, following
the naming convention "method_model".
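The "method_model" lookup can be pictured with a small base R sketch. The helpers `ppi_ols` and `ppi_mean` below are hypothetical stand-ins defined only for this example, and `toy_dispatch` is an assumed simplification, not the wrapper's actual implementation.

```r
# Hypothetical stand-ins for method-specific helpers; not the package's own code.
ppi_ols  <- function(...) "called ppi_ols"
ppi_mean <- function(...) "called ppi_mean"

# Toy wrapper: build "method_model" and look the helper up by name.
toy_dispatch <- function(method, model, ...) {
  helper_name <- paste(method, model, sep = "_")
  if (!exists(helper_name, mode = "function")) {
    stop("No helper named '", helper_name, "' is available.")
  }
  helper <- get(helper_name, mode = "function")
  helper(...)
}

toy_dispatch("ppi", "ols")
toy_dispatch("ppi", "mean")
```

A lookup of this kind fails early when a method and model combination has no matching helper.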
5. Auxiliary Arguments:
The wrapper function will take method-specific auxiliary arguments (e.g.,
q for the quantile estimation models) and pass them to the helper
function through the "..." with specified defaults for simplicity.
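Forwarding through "..." can be illustrated with plain R functions. Both `toy_wrapper` and `toy_quantile_helper` are hypothetical names, and the default `q = 0.5` is an assumption chosen for the sketch rather than a documented default.

```r
# Hypothetical helper with a method-specific argument `q` and an assumed default.
toy_quantile_helper <- function(y, q = 0.5) {
  unname(stats::quantile(y, probs = q))
}

# Toy wrapper: anything it does not name itself travels through `...`.
toy_wrapper <- function(y, ...) {
  toy_quantile_helper(y, ...)
}

y <- c(1, 2, 3, 4, 5)
toy_wrapper(y)            # falls back to the helper's default for q
toy_wrapper(y, q = 0.25)  # overrides the default via `...`
```

Because the wrapper never names `q` itself, new helper arguments can be added without changing the wrapper's signature.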
6. Other Arguments:
All other arguments that relate to all methods (e.g., alpha, ci.type), or other method-specific arguments, will have defaults.
a summary of model output.
An S4 object of class IPD with the following slots:
coefficients: Named numeric vector of estimated parameters.
se: Named numeric vector of standard errors.
ci: A matrix of confidence intervals, with columns lower and upper.
coefTable: A data.frame summarizing Estimate, Std. Error, z-value, and Pr(>|z|) (glm-style).
fit: The raw output list returned by the method-specific helper function.
formula: The formula used for fitting the IPD model.
data_l: The labeled data.frame used in the analysis.
data_u: The unlabeled data.frame used in the analysis.
method: A character string indicating which IPD method was applied.
model: A character string indicating the downstream inferential model.
intercept: A logical indicating whether an intercept was included.
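To show how the coefficients, se, ci, and coefTable slots relate to one another, here is a minimal sketch that assumes Wald-type (normal approximation) inference with a two-sided alternative. The estimates and standard errors are made-up numbers, not output from ipd(), and individual methods may compute these quantities differently.

```r
# Hypothetical estimates and standard errors (not output from ipd()).
est <- c("(Intercept)" = 0.50, X1 = 1.20)
se  <- c("(Intercept)" = 0.25, X1 = 0.30)
alpha <- 0.05

# Assumed Wald-type statistics for a two-sided alternative.
z_value <- est / se
p_value <- 2 * stats::pnorm(-abs(z_value))
crit    <- stats::qnorm(1 - alpha / 2)

# Confidence intervals with the column names listed for the ci slot.
ci <- cbind(lower = est - crit * se, upper = est + crit * se)

# glm-style summary table with the columns listed for the coefTable slot.
coef_table <- data.frame(
  Estimate     = est,
  `Std. Error` = se,
  `z-value`    = z_value,
  `Pr(>|z|)`   = p_value,
  check.names  = FALSE
)

coef_table
ci
```

On a fitted S4 object the corresponding slots would be read with the `@` operator.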
#-- Generate Example Data
dat <- simdat(n = c(300, 300, 300), effect = 1, sigma_Y = 1)
head(dat)
formula <- Y - f ~ X1
#-- Chen and Chen Correction (Gronsbell et al., 2026)
ipd(formula,
method = "chen", model = "ols",
data = dat, label = "set_label"
)
#-- Prediction Decorrelated Inference (Gan et al., 2024)
ipd(formula,
method = "pdc", model = "ols",
data = dat, label = "set_label"
)
#-- PostPI Analytic Correction (Wang et al., 2020)
ipd(formula,
method = "postpi_analytic", model = "ols",
data = dat, label = "set_label"
)
#-- PostPI Bootstrap Correction (Wang et al., 2020)
nboot <- 200
ipd(formula,
method = "postpi_boot", model = "ols",
data = dat, label = "set_label", nboot = nboot
)
#-- PPI (Angelopoulos et al., 2023)
ipd(formula,
method = "ppi", model = "ols",
data = dat, label = "set_label"
)
#-- PPI "All" (Gronsbell et al., 2025)
ipd(formula,
method = "ppi_a", model = "ols",
data = dat, label = "set_label"
)
#-- PPI++ (Angelopoulos et al., 2023)
ipd(formula,
method = "ppi_plusplus", model = "ols",
data = dat, label = "set_label"
)
#-- PSPA (Miao et al., 2023)
ipd(formula,
method = "pspa", model = "ols",
data = dat, label = "set_label"
)