rhat_ptb: Semi-supervised correlation estimation


Description

This function estimates the correlation between a covariate and an outcome that is observed only for a small subset of the data. The outcome is imputed for all observations using a smoothed predictor learned from a set of surrogate variables that are available for all the data.

Usage

rhat_ptb(data, nn, outcome_name = NULL, covariate_name = NULL,
  surrogate_name = NULL, bw, cdf_trans = TRUE, weights = NULL,
  ptb_beta = TRUE, adjust_covariates_name = NULL, do_interact = TRUE,
  X = NULL)
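
A minimal sketch of preparing an input for this signature, given that the nn labeled rows must come first in data. The column names (y, x, s1, s2), the sample sizes and the bandwidth value are illustrative assumptions, not values prescribed by the package, and the call itself is left commented out because it requires sslcov to be installed.

```r
set.seed(1)
n_total <- 200   # total number of observations (illustrative)
nn <- 50         # number of labeled observations (illustrative)

# Hypothetical columns: y (outcome), x (covariate), s1 and s2 (surrogates)
dat <- data.frame(
  x  = rnorm(n_total),
  s1 = rnorm(n_total),
  s2 = rnorm(n_total)
)
dat$y <- dat$x + dat$s1 + rnorm(n_total)

# Pretend the outcome is observed only for a random subset of size nn
labeled <- sample(n_total, nn)
dat$y[-labeled] <- NA

# Reorder so that the nn labeled rows come first
dat <- dat[order(is.na(dat$y)), ]
rownames(dat) <- NULL

# Sketch of a call (bw = 0.5 is an arbitrary placeholder):
# res <- rhat_ptb(dat, nn = nn, outcome_name = "y", covariate_name = "x",
#                 surrogate_name = c("s1", "s2"), bw = 0.5)
```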

Arguments

data

the data. The first nn rows should be the labeled data, the remaining rows should be the unlabeled data.

nn

the number of labeled observations, i.e. the number of leading rows of data for which the outcome is observed

outcome_name

a character string containing the name of the column from data containing the partly missing outcome of interest

covariate_name

a character string containing the name of the column from data containing the covariate to be related to the outcome of interest

surrogate_name

a character string vector containing the name of the column(s) from data containing the surrogate variable(s)

bw

the bandwidth to use

cdf_trans

a logical flag indicating whether the smoothing should be performed on the data transformed with their cdf. Default is TRUE. See Details.

weights

a vector of weights in case a weighted version of the correlation has to be computed. Default is NULL, in which case, no additional weighting is done and regular perturbation is performed.

ptb_beta

logical flag indicating whether the beta coefficient should be perturbed. Default is TRUE.

adjust_covariates_name

optional vector of names of the covariates to adjust on during imputation and smoothing. Default is NULL.

do_interact

logical flag indicating whether interactions between x and covariates should be taken into account when imputing y. Default is TRUE.

X

perturbation index needed for the sapply call. This is a purely artificial argument that is never used inside the function; it exists only so that sapply works. Default is NULL.
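
The role of X can be illustrated with a stand-in function. The sketch below does not call rhat_ptb; toy_ptb is a hypothetical function, and the exponential weights are an illustrative choice rather than the package's perturbation scheme. It only shows how sapply hands each index to an otherwise unused X argument when all other arguments are passed by name.

```r
# Hypothetical stand-in for a perturbation function: X is accepted but ignored
toy_ptb <- function(values, X = NULL) {
  w <- rexp(length(values))      # random perturbation weights (illustrative)
  sum(w * values) / sum(w)       # perturbed weighted mean
}

set.seed(42)
vals <- rnorm(100)

# Each element of 1:20 is passed positionally and lands on X,
# the only formal argument not matched by name
ptb_means <- sapply(X = 1:20, FUN = toy_ptb, values = vals)
length(ptb_means)
```

Each of the 20 evaluations draws fresh weights, so the result is a vector of 20 perturbed estimates even though the index itself is never read.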

Details

Smoothing over the CDF-transformed data prevents some tail estimation issues when the new data are substantially large.
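
As a rough illustration of what a CDF transform does, the sketch below applies base R's ecdf to a simulated skewed variable. Whether the package uses the empirical CDF in exactly this way is an assumption; the point is only that the transformed values are evenly spread over (0, 1], so a fixed bandwidth covers a similar number of points in the tails as in the centre.

```r
set.seed(7)
s <- rexp(500)            # simulated surrogate with a long right tail
s_cdf <- ecdf(s)(s)       # empirical CDF transform, values in (0, 1]

# On the raw scale the largest values are far apart;
# on the CDF scale neighbouring points are equally spaced
range(s_cdf)
gaps_raw <- diff(sort(s))
gaps_cdf <- diff(sort(s_cdf))
```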

Value

a list with the following elements:

See Also

smooth_ssl, rhat


stepcie/sslcov documentation built on May 30, 2019, 2:39 p.m.