Checks for common errors in user input, assembles the data the user provides, and applies the STRIDE estimators.
stride.estimator.wrapper(
n,
m,
p,
qvs,
q,
x,
delta,
ww,
zz,
run.NPMLEs,
run.NPNA,
run.NPNA_avg,
run.NPNA_wrong,
run.OLS,
run.WLS,
run.EFF,
run.EMPAVA,
tval,
tval0,
z.use,
w.use,
update.qs = FALSE,
know.true.groups = FALSE,
true.group.identifier = NULL,
run.prediction.accuracy = FALSE,
do_cross_validation_AUC_BS = FALSE
)
n |
sample size, must be at least 1. |
m |
number of different mixture proportions, must be at least 2. |
p |
number of populations, must be at least 2. |
qvs |
a numeric matrix of size |
q |
a numeric matrix of size |
x |
a numeric vector of length |
delta |
a numeric vector of length |
ww |
a numeric vector of length |
zz |
a numeric vector of length |
run.NPMLEs |
a logical indicator. If TRUE, then the output includes the estimated distribution function for mixture data based on the type I and type II nonparametric maximum likelihood estimators. The type I nonparametric maximum likelihood estimator is referred to as the "Kaplan-Meier" estimator in Garcia and Parast (2020). Neither the type I nor the type II estimator adjusts for covariates. |
run.NPNA |
a logical indicator. If TRUE, then the output includes the estimated distribution function for mixture data that accounts for covariates and dynamic landmarking. This estimator is called "NPNA" in Garcia and Parast (2020). |
run.NPNA_avg |
a logical indicator. If TRUE, then the output includes the estimated distribution function for mixture data that averages out over the observed covariates. This is referred to as NPNA_marg in Garcia and Parast (2020). |
run.NPNA_wrong |
a logical indicator. If TRUE, then the output includes the estimated distribution function for mixture data that adjusts for covariates, but ignores landmarking. This is referred to as NPNA_t_0=0 in Garcia and Parast (2020). |
run.OLS |
a logical indicator. If TRUE, then the output includes the estimated distribution function computed using an ordinary least squares influence function. The estimator adjusts for censoring using inverse probability weighting (IPW), augmented inverse probability weighting (AIPW), and imputation (IMP). See details in Wang et al (2012). These estimators do not adjust for covariates. |
run.WLS |
a logical indicator. If TRUE, then the output includes the estimated distribution function computed using a weighted least squares influence function. The estimator adjusts for censoring using inverse probability weighting (IPW), augmented inverse probability weighting (AIPW), and imputation (IMP). See details in Wang et al (2012). These estimators do not adjust for covariates. |
run.EFF |
a logical indicator. If TRUE, then the output includes the estimated distribution function computed using the efficient influence function based on Hilbert space projection theory results. The estimator adjusts for censoring using inverse probability weighting (IPW), augmented inverse probability weighting (AIPW), and imputation (IMP). See details in Wang et al (2012). These estimators do not adjust for covariates. |
run.EMPAVA |
logical indicator. If TRUE, we compute the distribution function for the mixture data based on an expectation-maximization (EM) algorithm that uses the pool adjacent violators algorithm (PAVA) from isotone regression to yield a non-negative and monotone estimator. This estimator does not adjust for covariates. See details in Qing et al (2014). |
tval |
numeric vector of time points at which the distribution function is evaluated, all values must be non-negative. |
tval0 |
numeric vector of time points representing the landmark times. All values must be non-negative
and smaller than the maximum of |
z.use |
numeric vector of values of the discrete covariate Z at which to evaluate the estimated distribution function.
The values of |
w.use |
numeric vector of values of the continuous covariate W at which to evaluate the estimated distribution function.
The values of |
update.qs |
logical indicator. If TRUE, the mixture proportions |
know.true.groups |
logical indicator. If TRUE, then we know the population identifier for each person in the sample. This option is only used for simulation studies to check prediction accuracy. Default is FALSE. |
true.group.identifier |
numeric vector of length |
run.prediction.accuracy |
logical indicator. If TRUE, then we compute the prediction accuracy measures, including the
area under the receiver operating characteristic curve (AUC) and the Brier Score (BS). Prediction accuracy is only valid
in simulation studies where |
do_cross_validation_AUC_BS |
logical indicator. If TRUE, then we compute the prediction accuracy measures, including the
area under the receiver operating characteristic curve (AUC) and the Brier Score (BS) using cross-validation. Prediction accuracy is only valid
in simulation studies where |
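The constraints stated for the scalar and time-point arguments can be checked before calling the wrapper. The following is a minimal sketch in base R, assuming only the rules listed above (`n` at least 1, `m` and `p` at least 2, non-negative `tval` and `tval0`). The helper `check_basic_inputs` is hypothetical and is not part of the package; the wrapper's own input checks may differ and may cover more arguments.

```r
# Hypothetical helper (NOT part of the package): collects violations of the
# constraints stated in the argument descriptions above.
check_basic_inputs <- function(n, m, p, tval, tval0) {
  problems <- character(0)
  if (!is.numeric(n) || length(n) != 1 || n < 1) {
    problems <- c(problems, "n must be at least 1")
  }
  if (!is.numeric(m) || length(m) != 1 || m < 2) {
    problems <- c(problems, "m must be at least 2")
  }
  if (!is.numeric(p) || length(p) != 1 || p < 2) {
    problems <- c(problems, "p must be at least 2")
  }
  if (!is.numeric(tval) || any(tval < 0)) {
    problems <- c(problems, "all values of tval must be non-negative")
  }
  if (!is.numeric(tval0) || any(tval0 < 0)) {
    problems <- c(problems, "all values of tval0 must be non-negative")
  }
  problems
}

# Valid inputs produce an empty character vector.
ok <- check_basic_inputs(n = 100, m = 4, p = 2,
                         tval = c(40, 50, 60), tval0 = c(0, 30))

# Invalid n, m, and tval each produce one message.
bad <- check_basic_inputs(n = 0, m = 1, p = 2,
                          tval = c(-1, 50), tval0 = 0)
print(bad)
```

Returning the messages as a vector, rather than stopping at the first failure, lets a caller report every problem at once.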
stride.estimator.wrapper returns a list containing
problem: a numeric indicator of errors in the NPNA, NPNA_avg, or NPNA_wrong estimator. If NULL, no error is reported. Otherwise, there is an error in the computation of the NPNA, NPNA_avg, or NPNA_wrong estimator.
Ft.estimate: a numeric array containing the estimated distribution functions for all methods and all p populations. The distribution function is evaluated at each tval, tval0, z.use (if non-NULL), and w.use (if non-NULL), for all p populations. The dimension of the array is (number of methods) by length(tval) by length(tval0) by length(z.use) by length(w.use) by p. If z.use and w.use are NULL, the dimension of the array is (number of methods) by length(tval) by length(tval0) by p. The distribution function is only valid for t ≥ t_0, so Ft.estimate shows NA for any combination for which t < t_0.
St.estimate: a numeric array containing the estimated distribution functions for all methods and all m mixture proportion subgroups. The distribution function is evaluated at each tval, tval0, z.use (if non-NULL), and w.use (if non-NULL), for all m mixture proportion subgroups. The dimension of the array is (number of methods) by length(tval) by length(tval0) by length(z.use) by length(w.use) by m. If z.use and w.use are NULL, the dimension of the array is (number of methods) by length(tval) by length(tval0) by m. The distribution function is only valid for t ≥ t_0, so St.estimate shows NA for any combination for which t < t_0.
Ft.AUC.BS: a numeric array containing the area under the receiver operating characteristic curve (AUC) and Brier Score (BS) for the p populations. The dimension of the array is (number of methods) by length(tval) by length(tval0) by 2, where the last dimension stores the AUC and BS results. Results for both the estimated distribution functions and prediction accuracy measures (AUC, BS) are only valid when t ≥ t_0, so arrays show NA for any combination for which t < t_0.
St.AUC.BS: a numeric array containing the area under the receiver operating characteristic curve (AUC) and Brier Score (BS) for the m mixture proportion groups. The dimension of the array is (number of methods) by length(tval) by length(tval0) by 2, where the last dimension stores the AUC and BS results. Results for both the estimated distribution functions and prediction accuracy measures (AUC, BS) are only valid when t ≥ t_0, so arrays show NA for any combination for which t < t_0.
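The array layout described for the case where z.use and w.use are NULL can be illustrated with a placeholder array in base R. This is only a sketch of the indexing and of the NA convention for t < t_0. The method labels, dimnames, and the constant value 0.5 are made up for illustration, and the object returned by the package may be labelled differently.

```r
# Illustrative values only; method labels are placeholders, not package names.
methods <- c("method_A", "method_B")
tval    <- c(40, 50, 60)
tval0   <- c(0, 45)
p       <- 2

# Placeholder array laid out as methods x length(tval) x length(tval0) x p.
Ft <- array(
  0.5,
  dim = c(length(methods), length(tval), length(tval0), p),
  dimnames = list(
    method = methods,
    t      = as.character(tval),
    t0     = as.character(tval0),
    pop    = paste0("p", seq_len(p))
  )
)

# Logical matrix (length(tval) x length(tval0)) flagging combinations with t < t0.
invalid <- outer(tval, tval0, "<")

# Set the invalid combinations to NA for every method and population.
for (i in seq_along(methods)) {
  for (k in seq_len(p)) {
    slice <- Ft[i, , , k]
    slice[invalid] <- NA
    Ft[i, , , k] <- slice
  }
}

Ft["method_A", "40", "45", "p1"]  # NA, because t = 40 is below t0 = 45
Ft["method_A", "50", "45", "p1"]  # 0.5, because t = 50 is at least t0 = 45
```

When z.use and w.use are supplied, the same idea applies with two extra dimensions between tval0 and the population index.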
We estimate nonparametric distribution functions for mixture data where the population identifiers are unknown, and the probability of belonging to a population is known (typically estimated with external data). The distribution functions are evaluated at the time points in tval. All estimators adjust for dynamic landmark prediction. Dynamic landmark prediction means that the distribution function is computed knowing that the survival time, T, satisfies T > t_0, where t_0 is one of the time points in tval0. The NPNA, NPNA_avg, and NPNA_wrong estimators adjust for one discrete covariate (zz) and one continuous covariate (ww).
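The conditioning on T > t_0 can be made concrete with a toy calculation. The sketch below uses fully observed event times from a single population and the empirical distribution function, so it is not any of the STRIDE estimators (which handle censoring, mixture membership, and covariates). It only shows the identity P(T ≤ t | T > t_0) = (F(t) − F(t_0)) / (1 − F(t_0)) for t ≥ t_0. The data and the helper `landmark_cdf` are made up for illustration.

```r
# Toy data: fully observed event times, no censoring, no mixture structure.
event_times <- c(35, 42, 48, 55, 61, 67, 70, 74)
F_hat <- ecdf(event_times)  # empirical distribution function

# Hypothetical helper: distribution function conditional on surviving past t0.
# Returns NA when t < t0, matching the convention described for the output arrays.
landmark_cdf <- function(t, t0) {
  if (t < t0) {
    return(NA_real_)
  }
  (F_hat(t) - F_hat(t0)) / (1 - F_hat(t0))
}

landmark_cdf(60, 45)  # 1/3: of the 6 subjects with T > 45, 2 have T <= 60
landmark_cdf(60, 0)   # 0.5: with t0 = 0 this reduces to F_hat(60)
landmark_cdf(40, 45)  # NA: t is below the landmark time
```

With t_0 = 0 the conditional distribution coincides with the unconditional one, which is why a landmark time of zero corresponds to ignoring landmarking.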
Garcia, T.P. and Parast, L. (2020). Dynamic landmark prediction for mixture data. Biostatistics, doi:10.1093/biostatistics/kxz052.
Garcia, T.P., Marder, K. and Wang, Y. (2017). Statistical modeling of Huntington disease onset. In Handbook of Clinical Neurology, vol 144, 3rd Series, editors Andrew Feigin and Karen E. Anderson.
Qing, J., Garcia, T.P., Ma, Y., Tang, M.X., Marder, K., and Wang, Y. (2014). Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint. Annals of Applied Statistics, 8(2), 1182-1208.
Wang, Y., Garcia, T.P., and Ma, Y. (2012). Nonparametric estimation for censored mixture data with application to the Cooperative Huntington's Observational Research Trial. Journal of the American Statistical Association, 107, 1324-1338.