R/findintercorr_cont_cat.R

#' @title Calculate Intermediate MVN Correlation for Continuous - Ordinal Variables
#'
#' @description This function calculates a \code{k_cont x k_cat} intermediate matrix of correlations for the \code{k_cont} continuous and
#'     \code{k_cat} ordinal (r >= 2 categories) variables. It extends the method of Demirtas et al. (2012, \doi{10.1002/sim.5362})
#'     for simulating binary and non-normal data using the Fleishman transformation by:
#'
#'     1) allowing the continuous variables to be generated via Fleishman's third-order or Headrick's fifth-order transformation, and
#'
#'     2) allowing for ordinal variables with more than 2 categories.
#'
#'     Here, the intermediate correlation between Z1 and Z2 (where Z1 is the standard normal variable transformed using Headrick's
#'     fifth-order or Fleishman's third-order method to produce a continuous variable Y1, and Z2 is the standard normal variable
#'     discretized to produce an ordinal variable Y2) is calculated by dividing the target correlation by a correction factor.  The
#'     correction factor is the product of the point-polyserial correlation between Y2 and Z2 (described in Olsson et al., 1982,
#'     \doi{10.1007/BF02294164})
#'     and the power method correlation (described in Headrick & Kowalchuk, 2007, \doi{10.1080/10629360600605065}) between Y1 and Z1.
#'     The point-polyserial correlation is given by:
#'     \deqn{\rho_{y2,z2} = (1/\sigma_{y2})*\sum_{j = 1}^{r-1} \phi(\tau_{j})(y2_{j+1} - y2_{j})} where
#'     \deqn{\phi(\tau) = (2\pi)^{-1/2}*exp(-\tau^2/2).}  Here, \eqn{\sigma_{y2}} is the standard deviation of Y2, \eqn{y2_{j}} is the j-th support
#'     value, and \eqn{\tau_{j}} is \eqn{\Phi^{-1}(\sum_{i=1}^{j} Pr(Y2 = y2_{i}))}.  The power method correlation is given by:
#'     \deqn{\rho_{y1,z1} = c1 + 3c3 + 15c5} where c5 = 0 if \code{method} = "Fleishman".  The function is used in
#'     \code{\link[SimMultiCorrData]{findintercorr}} and
#'     \code{\link[SimMultiCorrData]{findintercorr2}}.  This function would not ordinarily be called by the user.
#'
#' @param method the method used to generate the k_cont continuous variables.  "Fleishman" uses a third-order polynomial transformation
#'     and "Polynomial" uses Headrick's fifth-order transformation.
#' @param constants a matrix with \code{k_cont} rows, each a vector of constants c0, c1, c2, c3 (if \code{method} = "Fleishman") or
#'     c0, c1, c2, c3, c4, c5 (if \code{method} = "Polynomial"), like that returned by
#'     \code{\link[SimMultiCorrData]{find_constants}}.
#' @param rho_cont_cat a \code{k_cont x k_cat} matrix of target correlations between the continuous and ordinal variables
#' @param marginal a list of length equal to \code{k_cat}; the i-th element is a vector of the cumulative
#'     probabilities defining the marginal distribution of the i-th variable;
#'     if the variable can take r values, the vector will contain r - 1 probabilities (the r-th is assumed to be 1)
#' @param support a list of length equal to \code{k_cat}; the i-th element is a vector containing the r
#'     ordered support values
#' @export
#' @keywords intermediate, correlation, continuous, ordinal, Fleishman, Headrick
#' @seealso \code{\link[SimMultiCorrData]{power_norm_corr}}, \code{\link[SimMultiCorrData]{find_constants}},
#'     \code{\link[SimMultiCorrData]{findintercorr}}, \code{\link[SimMultiCorrData]{findintercorr2}}
#' @return a \code{k_cont x k_cat} matrix whose rows represent the \code{k_cont} continuous variables and columns represent
#'     the \code{k_cat} ordinal variables
#' @references
#' Demirtas H, Hedeker D, & Mermelstein RJ (2012). Simulation of massive public health data by power polynomials.
#'     Statistics in Medicine, 31(27): 3337-3346. \doi{10.1002/sim.5362}.
#'
#' Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. \doi{10.1007/BF02293811}.
#'
#' Headrick TC (2002). Fast Fifth-order Polynomial Transforms for Generating Univariate and Multivariate
#'     Non-normal Distributions. Computational Statistics & Data Analysis, 40(4):685-711. \doi{10.1016/S0167-9473(02)00072-5}.
#'     (\href{http://www.sciencedirect.com/science/article/pii/S0167947302000725}{ScienceDirect})
#'
#' Headrick TC (2004). On Polynomial Transformations for Simulating Multivariate Nonnormal Distributions.
#'     Journal of Modern Applied Statistical Methods, 3(1), 65-71. \doi{10.22237/jmasm/1083370080}.
#'
#' Headrick TC, Kowalchuk RK (2007). The Power Method Transformation: Its Probability Density Function, Distribution
#'     Function, and Its Further Use for Fitting Data. Journal of Statistical Computation and Simulation, 77, 229-249. \doi{10.1080/10629360600605065}.
#'
#' Headrick TC, Sawilowsky SS (1999). Simulating Correlated Non-normal Distributions: Extending the Fleishman Power
#'     Method. Psychometrika, 64, 25-35. \doi{10.1007/BF02294317}.
#'
#' Headrick TC, Sheng Y, & Hodis FA (2007). Numerical Computing and Graphics for the Power Method Transformation Using
#'     Mathematica. Journal of Statistical Software, 19(3), 1 - 17. \doi{10.18637/jss.v019.i03}.
#'
#' Olsson U, Drasgow F, & Dorans NJ (1982). The Polyserial Correlation Coefficient. Psychometrika, 47(3): 337-347.
#'     \doi{10.1007/BF02294164}.
#'
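findintercorr_cont_cat <- function(method = c("Fleishman", "Polynomial"),
                                   constants, rho_cont_cat, marginal,
                                   support) {
  # Resolve the default so a single method string reaches power_norm_corr()
  method <- match.arg(method)
  Sigma_cont_cat <- matrix(1, nrow = nrow(rho_cont_cat),
                           ncol = ncol(rho_cont_cat))
  # seq_len() avoids the 1:0 pitfall when a dimension is zero
  for (i in seq_len(nrow(rho_cont_cat))) {
    # Power method correlation between Y1 and Z1; depends only on row i
    pm_corr <- power_norm_corr(constants[i, ], method)
    for (j in seq_len(ncol(rho_cont_cat))) {
      # Intermediate correlation = target correlation / correction factor,
      # where the correction factor is the point-polyserial correlation
      # times the power method correlation
      Sigma_cont_cat[i, j] <-
        (rho_cont_cat[i, j] * sqrt(var_cat(marginal[[j]], support[[j]]))) /
        (denom_corr_cat(marginal[[j]], support[[j]]) * pm_corr)
    }
  }
  return(Sigma_cont_cat)
}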
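
The correction described in the documentation above can be sketched for a single continuous-ordinal pair using base R only. This is a minimal illustration, not the package's implementation: `point_polyserial`, `power_corr`, and `intermediate_corr` are hypothetical helper names, and the Fleishman constants are approximate values often quoted for a distribution with skew 2 and excess kurtosis 6 (in practice they would come from `find_constants`).

```r
# Hypothetical helper (not a package function): point-polyserial correlation
# between an ordinal variable Y2 and the standard normal Z2 it is cut from.
# `marginal` holds the r - 1 cumulative probabilities, `support` the r values.
point_polyserial <- function(marginal, support) {
  probs <- diff(c(0, marginal, 1))             # category probabilities
  mu    <- sum(probs * support)                # mean of Y2
  sigma <- sqrt(sum(probs * (support - mu)^2)) # standard deviation of Y2
  tau   <- qnorm(marginal)                     # thresholds on the Z2 scale
  sum(dnorm(tau) * diff(support)) / sigma
}

# Hypothetical helper: power method correlation c1 + 3*c3 + 15*c5.
# `constants` is c(c0, c1, c2, c3) or c(c0, c1, c2, c3, c4, c5);
# c5 is taken as 0 when only four constants are supplied (Fleishman).
power_corr <- function(constants) {
  c5 <- if (length(constants) >= 6) constants[6] else 0
  constants[2] + 3 * constants[4] + 15 * c5
}

# Intermediate correlation = target correlation / correction factor.
intermediate_corr <- function(rho, constants, marginal, support) {
  rho / (point_polyserial(marginal, support) * power_corr(constants))
}

# Assumed inputs for illustration only.
fleishman_const <- c(-0.3137, 0.8263, 0.3137, 0.0227) # approximate constants
marginal <- c(0.3, 0.7)                               # three categories
support  <- 1:3
rho_z <- intermediate_corr(0.4, fleishman_const, marginal, support)
print(rho_z)
```

Because both factors of the correction are below 1 in absolute value here, the intermediate correlation is larger in magnitude than the target of 0.4. Targets close to the feasible bounds can yield intermediate values outside [-1, 1], which this sketch does not check.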

SimMultiCorrData documentation built on May 2, 2019, 9:50 a.m.