fit_copula_OrdCont: Fit ordinal-continuous vine copula model

View source: R/fit_model_OrdCont_copula.R

fit_copula_OrdCont    R Documentation

Fit ordinal-continuous vine copula model

Description

fit_copula_OrdCont() fits the ordinal-continuous vine copula model. See Details for more information about this model.

Usage

fit_copula_OrdCont(
  data,
  copula_family,
  marginal_S0,
  marginal_S1,
  K_T,
  start_copula,
  method = "BFGS",
  ...
)

Arguments

data

data frame with three columns in the following order: surrogate endpoint, true endpoint, and treatment indicator (0/1 coding). Ordinal endpoints should be integers starting from 1.
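As an illustration of this layout, the following minimal sketch builds a simulated data frame with a continuous surrogate and an ordinal true endpoint with three categories. The column names (surr, true, treat) and the simulation settings are assumptions made for this example; only the column order is prescribed above.

```r
set.seed(1)
n <- 200

# Treatment indicator with 0/1 coding
treat <- rep(c(0L, 1L), each = n / 2)

# Continuous surrogate endpoint (simulated, illustrative only)
surr <- rnorm(n, mean = 0.5 * treat, sd = 1)

# Ordinal true endpoint: integers 1, 2, 3 obtained by cutting a latent variable
latent <- 0.7 * surr + rnorm(n)
true_cat <- findInterval(latent, c(-0.5, 0.5)) + 1L

# Column order: surrogate endpoint, true endpoint, treatment indicator
dat <- data.frame(surr = surr, true = true_cat, treat = treat)
head(dat)
```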

copula_family

Parametric copula family for each treatment group; each element is one of "clayton", "frank", "gaussian", or "gumbel". The first element in copula_family corresponds to the control group, the second to the experimental group.

marginal_S0, marginal_S1

List with the following five elements (in order):

  • Density function with first argument x and second argument para, the parameter vector for this distribution.

  • Distribution function with first argument x and second argument para, the parameter vector for this distribution.

  • Inverse distribution function with first argument p and second argument para, the parameter vector for this distribution.

  • The number of elements in para.

  • A vector of starting values for para.
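A minimal sketch of such a list for a normal marginal distribution is given below. The element names (pdf_fun, cdf_fun, q_fun, n_para, start) and the parameterization para = c(mean, sd) are assumptions for this example; the description above only fixes the order of the elements.

```r
# Hypothetical marginal specification for a normal distribution.
# para[1] is the mean and para[2] the standard deviation (assumed here).
marginal_normal <- list(
  pdf_fun = function(x, para) dnorm(x, mean = para[1], sd = para[2]),
  cdf_fun = function(x, para) pnorm(x, mean = para[1], sd = para[2]),
  q_fun   = function(p, para) qnorm(p, mean = para[1], sd = para[2]),
  n_para  = 2,
  start   = c(0, 1)
)

# Sanity check: the inverse distribution function should undo the
# distribution function at the starting values.
p <- marginal_normal[[2]](1.3, marginal_normal[[5]])
x_back <- marginal_normal[[3]](p, marginal_normal[[5]])
```

The same list could then be supplied as both marginal_S0 and marginal_S1 if the surrogate is modeled with the same family in both treatment groups.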

K_T

Number of categories in the true endpoint.

start_copula

Starting value for the copula parameter.

method

Optimization algorithm for maximizing the objective function. For all options, see ?maxLik::maxLik. Defaults to "BFGS".

...

Arguments passed on to fit_copula_submodel_OrdCont

names_XY

Names for X and Y, respectively.

twostep

(boolean) If TRUE, the starting values are fixed for the marginal distributions and only the copula parameter is estimated.

start_Y

Starting values for the marginal distribution parameters for Y.

X

First variable (Ordinal with K categories)

Y

Second variable (Continuous)

K

Number of categories in X.

marginal_Y

List with the following five elements (in order):

  • Density function with first argument x and second argument para, the parameter vector for this distribution.

  • Distribution function with first argument x and second argument para.

  • Inverse distribution function with first argument p and second argument para.

  • The number of elements in para.

  • Starting values for para.

Details

Vine Copula Model for Ordinal Endpoints

Following the Neyman-Rubin potential outcomes framework, we assume that each patient has four potential outcomes, two for each arm, represented by \boldsymbol{Y} = (T_0, S_0, S_1, T_1)'. Here, \boldsymbol{Y_z} = (S_z, T_z)' are the potential surrogate and true endpoints under treatment Z = z. We will further assume that T is ordinal and S is continuous; consequently, the function argument X corresponds to T and Y to S. (The roles of S and T can be interchanged without loss of generality.)

We introduce latent variables to model \boldsymbol{Y}. Latent variables will be denoted by a tilde. For instance, if T_z is ordinal with K_T categories, then T_z is a function of the latent \tilde{T}_z \sim N(0, 1) as follows:

T_z = g_{T_z}(\tilde{T}_z; \boldsymbol{c}^{T_z}) = \begin{cases} 1 & \text{ if } -\infty = c_0^{T_z} < \tilde{T}_z \le c_1^{T_z} \\ \vdots \\ k & \text{ if } c_{k - 1}^{T_z} < \tilde{T}_z \le c_k^{T_z} \\ \vdots \\ K_T & \text{ if } c_{K_T - 1}^{T_z} < \tilde{T}_z \le c_{K_T}^{T_z} = \infty, \end{cases}

where \boldsymbol{c}^{T_z} = (c_1^{T_z}, \cdots, c_{K_T - 1}^{T_z}). The latent counterpart of \boldsymbol{Y} is again denoted by a tilde; for example, \tilde{\boldsymbol{Y}} = (\tilde{T}_0, S_0, S_1, \tilde{T}_1)' if T_z is ordinal and S_z is continuous.
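The mapping from the latent variable to the ordinal category can be sketched in base R as follows. The cutpoint values and the number of categories are arbitrary choices for this example, not estimates from any data set.

```r
set.seed(42)

K_T <- 4
cutpoints <- c(-1, 0, 1.2)  # c_1, ..., c_{K_T - 1} (illustrative values)

# Latent standard normal variable
latent <- rnorm(1000)

# With left.open = TRUE, findInterval() returns the number of cutpoints
# strictly below each value, so adding 1 gives the category k with
# c_{k-1} < latent <= c_k.
T_obs <- findInterval(latent, cutpoints, left.open = TRUE) + 1L

# Category probabilities implied by the cutpoints under N(0, 1)
probs <- diff(pnorm(c(-Inf, cutpoints, Inf)))

# Empirical versus implied category frequencies
rbind(empirical = as.numeric(table(factor(T_obs, levels = 1:K_T))) / 1000,
      implied = probs)
```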

The vector of latent potential outcomes \tilde{\boldsymbol{Y}} is modeled with a D-vine copula as follows:

f_{\tilde{\boldsymbol{Y}}} = f_{\tilde{T}_0} \, f_{S_0} \, f_{S_1} \, f_{\tilde{T}_1} \cdot c_{\tilde{T}_0, S_0 } \, c_{S_0, S_1} \, c_{S_1, \tilde{T}_1} \cdot c_{\tilde{T}_0, S_1; S_0} \, c_{S_0, \tilde{T}_1; S_1} \cdot c_{\tilde{T}_0, \tilde{T}_1; S_0, S_1},

where (i) f_{\tilde{T}_0}, f_{S_0}, f_{S_1}, and f_{\tilde{T}_1} are univariate density functions, (ii) c_{\tilde{T}_0, S_0}, c_{S_0, S_1}, and c_{S_1, \tilde{T}_1} are unconditional bivariate copula densities, and (iii) c_{\tilde{T}_0, S_1; S_0}, c_{S_0, \tilde{T}_1; S_1}, and c_{\tilde{T}_0, \tilde{T}_1; S_0, S_1} are conditional bivariate copula densities (e.g., c_{\tilde{T}_0, S_1; S_0} is the copula density of (\tilde{T}_0, S_1)' \mid S_0). We also make the simplifying assumption for all copulas, i.e., the conditional copulas do not depend on the values of the conditioning variables.

Observed-Data Likelihood

In practice, we only observe (S_0, T_0)' or (S_1, T_1)'. Hence, to estimate the (identifiable) parameters of the D-vine copula model, we need to derive the observed-data likelihood. The observed-data density for (S_z, T_z)', from which the loglikelihood follows, is:

f_{\boldsymbol{Y_z}}(s, t; \boldsymbol{\beta}) = \int_{c^{T_z}_{t - 1}}^{+ \infty} f_{\boldsymbol{\tilde{Y}_z}}(s, x; \boldsymbol{\beta}) \, dx - \int_{c^{T_z}_{t}}^{+ \infty} f_{\boldsymbol{\tilde{Y}_z}}(s, x; \boldsymbol{\beta}) \, dx.

The above expression is used in ordinal_continuous_loglik() to compute the loglikelihood for the observed values for Z = 0 or Z = 1. In this function, X and Y correspond to T_z and S_z if T_z is ordinal and S_z continuous. Otherwise, X and Y correspond to S_z and T_z.
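To make the difference of integrals concrete, the sketch below evaluates the observed-data density in one special case: a Gaussian copula with correlation rho between the latent \tilde{T}_z and a normally distributed S_z. In that case the conditional distribution of \tilde{T}_z given S_z = s is normal, so both integrals have a closed form. The function obs_density() and its arguments are hypothetical and written for this example only; this is not the implementation in ordinal_continuous_loglik().

```r
# Observed-data density f(s, t) for ordinal T (category t) and continuous S,
# assuming a Gaussian copula and S ~ N(mean_S, sd_S^2).
obs_density <- function(s, t, rho, cutpoints, mean_S, sd_S) {
  c_all <- c(-Inf, cutpoints, Inf)  # c_0, c_1, ..., c_{K_T}
  z_s <- (s - mean_S) / sd_S        # normal score of s, i.e. qnorm(F_S(s))
  # P(latent T <= c | S = s) under the Gaussian copula
  cond_cdf <- function(c) pnorm((c - rho * z_s) / sqrt(1 - rho^2))
  # f_S(s) * P(c_{t-1} < latent T <= c_t | S = s)
  dnorm(s, mean = mean_S, sd = sd_S) *
    (cond_cdf(c_all[t + 1]) - cond_cdf(c_all[t]))
}

cutpoints <- c(-1, 0, 1.2)  # illustrative cutpoints, K_T = 4

# Summing over all categories recovers the marginal density of S
dens_by_cat <- sapply(1:4, function(t)
  obs_density(0.3, t, rho = 0.5, cutpoints = cutpoints, mean_S = 0, sd_S = 1))
sum(dens_by_cat)
dnorm(0.3)

# Loglikelihood for a few observed (s, t) pairs
s_obs <- c(-0.2, 0.8, 1.5)
t_obs <- c(1L, 3L, 4L)
loglik <- sum(log(obs_density(s_obs, t_obs, rho = 0.5,
                              cutpoints = cutpoints, mean_S = 0, sd_S = 1)))
```

Other copula families would replace cond_cdf() by the corresponding conditional copula distribution function; the overall structure of the calculation stays the same.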

Value

Returns an S3 object that can be used to perform the sensitivity analysis with sensitivity_analysis_copula().

Author(s)

Florian Stijven

See Also

sensitivity_analysis_copula(), print.vine_copula_fit(), plot.vine_copula_fit()


Surrogate documentation built on April 11, 2025, 6:09 p.m.