R/nonnormsys.R
In SimRepeat: Simulation of Correlated Systems of Equations with Multiple Variable Types

#' @title Generate Correlated Systems of Equations Containing Normal, Non-Normal, and Mixture Continuous Variables
#'
#' @description This function extends the techniques of Headrick and Beasley (2004, \doi{10.1081/SAC-120028431}) to
#'     create correlated systems of statistical equations containing continuous variables with normal,
#'     non-normal, or mixture distributions.  The method allows the user to control the distributions
#'     for the stochastic disturbance (error) terms \eqn{E} and independent variables \eqn{X}.  The user specifies the correlation structure
#'     between \eqn{X} terms within an outcome and across outcomes.  For a given equation, the user also specifies the correlation between
#'     the outcome \eqn{Y} and \eqn{X} terms.  These correlations are used to calculate the beta (slope) coefficients for the equations with
#'     \code{\link[SimRepeat]{calc_betas}}.  If the system contains mixture variables and \code{corr.yx} is specified in terms of non-mixture
#'     and mixture variables, the \code{betas} will be calculated in terms of non-mixture and mixture independent variables.  If \code{corr.yx}
#'     Finally, the user specifies the correlations across error terms.  The assumptions are that
#'     1) there are at least 2 equations and a total of at least 1 independent variable, 2) the independent variables are uncorrelated
#'     with the error terms, 3) each equation has an error term,
#'     and 4) all error terms have either a non-mixture or mixture distribution.  The outcomes \eqn{Y} are calculated as the \eqn{E} terms
#'     added to the products of the beta coefficients and the \eqn{X} terms.  There are no
#'     interactions between independent variables or distinction between subject and group-level terms (as in the hierarchical linear models approach
#'     of \code{\link[SimRepeat]{corrsys}} or \code{\link[SimRepeat]{corrsys2}}).  However, the user does not have to provide the beta coefficients
#'     (except for the intercepts).  Since the equations do not include random slopes (i.e. for the \eqn{X} terms), the effects of the independent
#'     variables are all considered "fixed."  However, a random intercept or a "time" effect with a random slope could be added by specifying
#'     them as independent variables.  There are no parameter input checks in order to decrease simulation time.  All inputs should be checked
#'     prior to simulation with \code{\link[SimRepeat]{checkpar}}.  Summaries of the simulation results can be found with \code{\link[SimRepeat]{summary_sys}}.
#'     The functions \code{\link[SimRepeat]{calc_corr_y}}, \code{\link[SimRepeat]{calc_corr_yx}}, and \code{\link[SimRepeat]{calc_corr_ye}} use equations
#'     from Headrick and Beasley (2004) to calculate the expected correlations for the outcomes, among a given outcome and covariates from
#'     the other outcomes, and for the error terms.  The vignette \strong{Theory and Equations for Correlated Systems of Continuous Variables}
#'     gives the equations, and the vignette \strong{Correlated Systems of Statistical Equations with Non-Mixture and Mixture Continuous
#'     Variables} gives examples.  There are also vignettes in \code{\link[SimCorrMix]{SimCorrMix}} which provide more details on continuous
#'     non-mixture and mixture variables.
#'
#' @section Generation of Continuous Non-Mixture and Mixture Variables:
#'     Mixture distributions describe random variables that are drawn from more than one component distribution.  For a random variable
#'     \eqn{X_{mix}} from a finite continuous mixture distribution with \eqn{k} components, the probability density function (PDF) can be
#'     described by:
#'
#'     \deqn{h_X(x) = \sum_{i=1}^{k} \pi_i f_{X_{comp_i}}(x), \sum_{i=1}^{k} \pi_i = 1.}
#'
#'     The \eqn{\pi_i} are mixing parameters which determine the weight of each component distribution \eqn{f_{X_{comp_i}}(x)} in the overall
#'     probability distribution.  As long as each component has a valid PDF, the overall distribution \eqn{h_X()} has a valid PDF.
#'     The main assumption is statistical independence between the process of randomly selecting the component distribution and the
#'     distributions themselves.  Simulation is done at the component-level for mixture variables.
#'
#'     All continuous variables are simulated using either Fleishman's third-order (\code{method} = "Fleishman",
#'     \doi{10.1007/BF02293811}) or Headrick's fifth-order (\code{method} = "Polynomial", \doi{10.1016/S0167-9473(02)00072-5})
#'     power method transformation (PMT).  It works by matching standardized cumulants -- the first four (mean,
#'     variance, skew, and standardized kurtosis) for Fleishman's method, or the first six (mean, variance,
#'     skew, standardized kurtosis, and standardized fifth and sixth cumulants) for Headrick's method.  The transformation is expressed
#'     as follows:
#'
#'     \deqn{Y = c_{0} + c_{1} * Z + c_{2} * Z^2 + c_{3} * Z^3 + c_{4} * Z^4 + c_{5} * Z^5, Z \sim N(0,1),}
#'
#'     where \eqn{c_{4}} and \eqn{c_{5}} both equal \eqn{0} for Fleishman's method.  The real constants are calculated by
#'     \code{\link[SimMultiCorrData]{find_constants}} for non-mixture and components of mixture variables.  Continuous mixture
#'     variables are generated componentwise and then transformed to the desired mixture variables using random multinomial variables
#'     generated based on mixing probabilities.  The correlation matrices are specified in terms of correlations with components of the
#'     mixture variables.
#'
#' @section Choice of Fleishman's third-order or Headrick's fifth-order method:
#'     Using the fifth-order approximation allows additional control over the fifth and sixth moments of the generated distribution, improving
#'     accuracy.  In addition, the range of feasible standardized kurtosis values, given skew and standardized fifth (\eqn{\gamma_3}) and sixth
#'     (\eqn{\gamma_4}) cumulants, is larger than with Fleishman's method (see \code{\link[SimMultiCorrData]{calc_lower_skurt}}).
#'     For example, the Fleishman method can not be used to generate a non-normal distribution with a ratio of
#'     \eqn{\gamma_3^2/\gamma_4 > 9/14} (see Headrick & Kowalchuk, 2007).  This eliminates the Chi-squared family of distributions, which has
#'     a constant ratio of \eqn{\gamma_3^2/\gamma_4 = 2/3}.  The fifth-order method also generates more distributions with valid PDF's.
#'     However, if the fifth and sixth cumulants are unknown or do not exist, the Fleishman approximation should be used.
#'
#' @section Reasons for Function Errors:
#'     1) The most likely cause for function errors is that the parameter inputs are mispecified.  Using \code{\link[SimRepeat]{checkpar}}
#'     prior to simulation can help decrease these errors.
#'
#'     2) No solutions to \code{\link[SimMultiCorrData]{fleish}} or
#'     \code{\link[SimMultiCorrData]{poly}} converged when using \code{\link[SimMultiCorrData]{find_constants}}.  If this happens,
#'     the simulation will stop.  It may help to first use \code{\link[SimMultiCorrData]{find_constants}} for each continuous variable to
#'     determine if a sixth cumulant correction value is needed.  If the standardized cumulants are obtained from \code{calc_theory}, the user
#'     may need to use rounded values as inputs (i.e. \code{skews = round(skews, 8)}).  For example, in order to ensure that skew is exactly 0 for symmetric distributions.
#'
#'     3) The kurtosis for a continuous variable may be outside the region of possible values.  There is an associated lower kurtosis boundary for
#'     associated with a given skew (for Fleishman's method) or skew and fifth and sixth cumulants (for Headrick's method).  Use
#'     \code{\link[SimMultiCorrData]{calc_lower_skurt}} to determine the boundary for a given set of cumulants.
#'
#'     4) No solutions to \code{\link[SimRepeat]{calc_betas}} converged when trying to find the beta coefficients.  Try different correlation
#'     matrices.
#'
#' @param n the sample size (i.e. the length of each simulated variable; default = 10000)
#' @param M the number of dependent variables \eqn{Y} (outcomes); equivalently, the number of equations in the system
#' @param method the PMT method used to generate all continuous variables, including independent variables (covariates) and error terms;
#'     "Fleishman" uses Fleishman's third-order polynomial transformation and "Polynomial" uses Headrick's fifth-order transformation
#' @param error_type "non_mix" if all error terms have continuous non-mixture distributions, "mix" if all error terms have continuous mixture distributions
#' @param means a list of length \code{M} of vectors of means for the non-mixture (\eqn{X_{cont}}) and mixture (\eqn{X_{mix}}) independent variables
#'     and for the error terms (\eqn{E}); the order in each vector should be: \eqn{X_{cont}, X_{mix}, E} so that the order for
#'     \eqn{X_{cont}, X_{mix}} is
#'     the same as in \code{corr.x} (considering the components of mixture variables)
#' @param vars a list of length \code{M} of vectors of variances for \eqn{X_{cont}, X_{mix}, E}; same order and dimension as \code{means}
#' @param skews a list of length \code{M} of vectors of skew values for \eqn{X_{cont}} and \eqn{E} (if \code{error_type = "non_mix"});
#'     same order as in \code{corr.x} and \code{means}
#' @param skurts a list of length \code{M} of vectors of standardized kurtoses (kurtosis - 3) for \eqn{X_{cont}} and \eqn{E} (if \code{error_type = "non_mix"});
#'     same order and dimension as \code{skews}
#' @param fifths a list of length \code{M} of vectors of standardized fifth cumulants for \eqn{X_{cont}} and \eqn{E} (if \code{error_type = "non_mix"});
#'     same order and dimension as \code{skews}; not necessary for \code{method = "Fleishman"}
#' @param sixths a list of length \code{M} of vectors of standardized sixth cumulants for \eqn{X_{cont}} and \eqn{E} (if \code{error_type = "non_mix"});
#'     same order and dimension as \code{skews}; not necessary for \code{method = "Fleishman"}
#' @param Six a list of length \code{M}, where \code{Six[[p]][[j]]} is a vector of sixth cumulant correction values to aid in finding a valid PDF for
#'     \eqn{X_{cont(pj)}}, the j-th continuous non-mixture covariate for outcome \eqn{Y_p}; the last element of \code{Six[[p]]} is for \eqn{E_p}
#'     (if \code{error_type = "non_mix"}); use \code{Six[[p]][[j]] = NULL} if no correction desired for \eqn{X_{cont(pj)}};
#'     use \code{Six[[p]] = NULL} if no correction desired for any non-mixture covariate or error term in equation p;
#'     keep \code{Six = list()} if no corrections desired for all covariates or if \code{method = "Fleishman"}
#' @param mix_pis a list of length \code{M}, where \code{mix_pis[[p]][[j]]} is a vector of mixing probabilities that sum to 1 for \eqn{X_{mix(pj)}},
#'     the j-th continuous mixture covariate for outcome \eqn{Y_p}; the last element of \code{mix_pis[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_pis[[p]] = NULL}; components should be ordered as in \code{corr.x}
#' @param mix_mus a list of length \code{M}, where \code{mix_mus[[p]][[j]]} is a vector of means of the component distributions for
#'     \eqn{X_{mix(pj)}}; the last element of \code{mix_mus[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_mus[[p]] = NULL}
#' @param mix_sigmas a list of length \code{M}, where \code{mix_sigmas[[p]][[j]]} is a vector of standard deviations of the component distributions for
#'     \eqn{X_{mix(pj)}}; the last element of \code{mix_sigmas[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_sigmas[[p]] = NULL}
#' @param mix_skews a list of length \code{M}, where \code{mix_skews[[p]][[j]]} is a vector of skew values of the component distributions for
#'     \eqn{X_{mix(pj)}}; the last element of \code{mix_skews[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_skews[[p]] = NULL}
#' @param mix_skurts a list of length \code{M}, where \code{mix_skurts[[p]][[j]]} is a vector of standardized kurtoses of the component distributions for
#'     \eqn{X_{mix(pj)}}; the last element of \code{mix_skurts[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_skurts[[p]] = NULL}
#' @param mix_fifths a list of length \code{M}, where \code{mix_fifths[[p]][[j]]} is a vector of standardized fifth cumulants of the component distributions for
#'     \eqn{X_{mix(pj)}}; the last element of \code{mix_fifths[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_fifths[[p]] = NULL}; not necessary for \code{method = "Fleishman"}
#' @param mix_sixths a list of length \code{M}, where \code{mix_sixths[[p]][[j]]} is a vector of standardized sixth cumulants of the component distributions for
#'     \eqn{X_{mix(pj)}}; the last element of \code{mix_sixths[[p]]} is for \eqn{E_p} (if \code{error_type = "mix"});
#'     if \eqn{Y_p} has no mixture variables, use \code{mix_sixths[[p]] = NULL}; not necessary for \code{method = "Fleishman"}
#' @param mix_Six a list of length \code{M}, where \code{mix_Six[[p]]} is a list of length equal to the total number of component distributions for
#'     the \eqn{X_{mix(p)}} and \eqn{E_p} (if \code{error_type = "mix"});
#'     \code{mix_Six[[p]][[j]]} is a vector of sixth cumulant corrections for the j-th component distribution (i.e., if there are 2
#'     continuous mixture independent variables for \eqn{Y_p}, where \eqn{X_{mix(p1)}} has 2 components and \eqn{X_{mix(p2)}} has 3
#'     components, then \code{length(mix_Six[[p]]) = 5} and \code{mix_Six[[p]][[3]]} would correspond to the 1st component of
#'     \eqn{X_{mix(p2)}}); use \code{mix_Six[[p]][[j]] = NULL} if no correction desired for that component;
#'     use \code{mix_Six[[p]] = NULL} if no correction desired for any component of \eqn{X_{mix(p)}} and \eqn{E_p};
#'     keep \code{mix_Six = list()} if no corrections desired for all covariates or if \code{method = "Fleishman"}
#' @param corr.x list of length \code{M}, each component a list of length \code{M}; \code{corr.x[[p]][[q]]} is matrix of correlations
#'     for independent variables in equations p (\eqn{X_{(pj)}} for outcome \eqn{Y_p}) and q (\eqn{X_{(qj)}} for outcome \eqn{Y_q});
#'     order: 1st continuous non-mixture (same order as in \code{skews}) and 2nd components of continuous mixture (same order as in \code{mix_pis});
#'     if p = q, \code{corr.x[[p]][[q]]} is a correlation matrix with \code{nrow(corr.x[[p]][[q]])} = # of non-mixture + # of mixture components for outcome \eqn{Y_p};
#'     if p != q, \code{corr.x[[p]][[q]]} is a non-symmetric matrix of correlations where rows correspond to covariates for \eqn{Y_p}
#'     so that \code{nrow(corr.x[[p]][[q]])} = # of non-mixture + # of mixture components for outcome \eqn{Y_p} and
#'     columns correspond to covariates for \eqn{Y_q} so that \code{ncol(corr.x[[p]][[q]])} = # of non-mixture + # of mixture components for outcome \eqn{Y_q};
#'     use \code{corr.x[[p]][[q]] = NULL} if equation q has no \eqn{X_{(qj)}}; use \code{corr.x[[p]] = NULL} if equation p has no \eqn{X_{(pj)}}
#' @param corr.yx a list of length \code{M}, where the p-th component is a 1 row matrix of correlations between \eqn{Y_p} and \eqn{X_{(pj)}};
#'     if there are mixture variables and the \code{betas} are desired in terms of these (and not the components), then \code{corr.yx}
#'     should be specified in terms of correlations between outcomes and non-mixture or mixture variables, and the number of columns of the matrices
#'     of \code{corr.yx} should not match the dimensions of the matrices in \code{corr.x}; if the \code{betas} are desired in terms of
#'     the components, then \code{corr.yx} should be specified in terms of correlations between outcomes and non-mixture or components of
#'     mixture variables, and the number of columns of the matrices of \code{corr.yx} should match the dimensions of the matrices in \code{corr.x};
#'     use \code{corr.yx[[p]] = NULL} if equation p has no \eqn{X_{(pj)}}
#' @param corr.e correlation matrix for continuous non-mixture or components of mixture error terms
#' @param same.var either a vector or a matrix; if a vector, \code{same.var} includes column numbers of \code{corr.x[[1]][[1]]}
#'     corresponding to independent variables that should be identical across equations; these terms must have the same indices for all \code{p = 1, ..., M};
#'     i.e., if the 1st variable represents height which should be the same for each equation, then
#'     \code{same.var[1] = 1} and the 1st term for all other outcomes must also be height;
#'     if a matrix, columns 1 and 2 are outcome p and column index in \code{corr.x[[p]][[p]]} for 1st instance of variable,
#'     columns 3 and 4 are outcome q and column index in \code{corr.x[[q]][[q]]} for subsequent instances of variable; i.e., if
#'     1st term for all outcomes is height and \code{M = 3}, then \code{same.var = matrix(c(1,} \code{1, 2, 1, 1, 1, 3, 1), 2, 4, byrow = TRUE)}; the
#'     independent variable index corresponds to continuous non-mixture and component of continuous mixture covariate
#' @param betas.0 vector of length \code{M} containing intercepts, if \code{NULL} all set equal to 0; if length 1, all intercepts set to \code{betas.0}
#' @param seed the seed value for random number generation (default = 1234)
#' @param use.nearPD TRUE to convert the overall intermediate correlation matrix formed by the \eqn{X} (for all outcomes and independent
#'     variables) or \eqn{E} to the nearest positive definite matrix with \code{Matrix::nearPD} if necessary;
#'     if FALSE and \code{adjgrad = FALSE} the negative eigenvalues are replaced with \code{eigmin} if necessary
#' @param eigmin minimum replacement eigenvalue if overall intermediate correlation matrix is not positive-definite (default = 0)
#' @param adjgrad TRUE to use \code{adj_grad} to convert overall intermediate correlation matrix to a positive-definite matrix and next
#'     5 inputs can be used
#' @param B1 the initial matrix for algorithm; if NULL, uses a scaled initial matrix with diagonal elements \code{sqrt(nrow(Sigma))/2}
#' @param tau parameter used to calculate theta (default = 0.5)
#' @param tol maximum error for Frobenius norm distance between new matrix and original matrix (default = 0.1)
#' @param steps maximum number of steps for k (default = 100)
#' @param msteps maximum number of steps for m (default = 10)
#' @param errorloop if TRUE, uses \code{\link[SimCorrMix]{corr_error}} to attempt to correct the correlation of the independent
#'     variables within and across outcomes to be within \code{epsilon} of the target correlations \code{corr.x} until the number of iterations
#'     reaches \code{maxit} (default = FALSE)
#' @param epsilon the maximum acceptable error between the final and target correlation matrices (default = 0.001) in the error loop
#' @param maxit the maximum number of iterations to use (default = 1000) in the error loop
#' @param quiet if FALSE prints messages, if TRUE suppresses messages
#' @import SimMultiCorrData
#' @import SimCorrMix
#' @importFrom stats cor dbeta dbinom dchisq density dexp df dgamma dlnorm dlogis dmultinom dnbinom dnorm dpois dt dunif dweibull ecdf
#'     median pbeta pbinom pchisq pexp pf pgamma plnorm plogis pnbinom pnorm ppois pt punif pweibull qbeta qbinom qchisq qexp qf qgamma
#'     qlnorm qlogis qnbinom qnorm qpois qt quantile qunif qweibull rbeta rbinom rchisq rexp rf rgamma rlnorm rlogis rmultinom rnbinom
#'     rnorm rpois rt runif rweibull sd uniroot var
#' @import utils
#' @importFrom Matrix nearPD
#' @importFrom VGAM qzipois qzinegbin
#' @export
#' @return A list with the following components:
#'
#' @return \code{Y} matrix with \code{n} rows and \code{M} columns of outcomes
#'
#' @return \code{X} list of length \code{M} containing \eqn{X_{cont(pj)}, X_{comp(pj)}}
#'
#' @return \code{X_all} list of length \code{M} containing \eqn{X_{cont(pj)}, X_{mix(pj)}}
#'
#' @return \code{E} matrix with \code{n} rows containing continuous non-mixture or components of continuous mixture error terms
#'
#' @return \code{E_mix} matrix with \code{n} rows containing continuous mixture error terms
#'
#' @return \code{betas} a matrix of the slope coefficients calculated with \code{\link[SimRepeat]{calc_betas}}, rows represent the
#'     outcomes
#'
#' @return \code{constants} a list of length \code{M} with data.frames of the constants for the \eqn{X_{cont(pj)}},
#'     \eqn{X_comp(pj)} and \eqn{E_p}
#'
#' @return \code{SixCorr} a list of length \code{M} of lists of sixth cumulant correction values used to obtain valid
#'     \emph{pdf}'s for the \eqn{X_{cont(pj)}}, \eqn{X_comp(pj)}, and \eqn{E_p}
#'
#' @return \code{valid.pdf} a list of length \code{M} of vectors where the i-th element is "TRUE" if the constants for the i-th
#'         continuous variable generate a valid pdf, else "FALSE"
#'
#' @return \code{Sigma_X0} matrix of intermediate correlations calculated by \code{intercorr}
#'
#' @return \code{Sigma_X} matrix of intermediate correlations after \code{nearPD} or \code{adj_grad} function has been used;
#'     applied to generate the normal variables transformed to get the desired distributions
#'
#' @return \code{Error_Time} the time in minutes required to use the error loop
#'
#' @return \code{Time} the total simulation time in minutes
#'
#' @return \code{niter} a matrix of the number of iterations used in the error loop
#'
#' @keywords continuous mixture Headrick Beasley
#' @seealso \code{\link[SimRepeat]{calc_betas}}, \code{\link[SimRepeat]{calc_corr_y}}, \code{\link[SimRepeat]{calc_corr_yx}},
#'     \code{\link[SimRepeat]{calc_corr_ye}}, \code{\link[SimRepeat]{checkpar}}, \code{\link[SimRepeat]{summary_sys}}
#' @references
#' Davenport JW, Bezder JC, & Hathaway RJ (1988). Parameter Estimation for Finite Mixture Distributions.
#'     Computers & Mathematics with Applications, 15(10):819-28.
#'
#' Everitt BS (1996). An Introduction to Finite Mixture Distributions. Statistical Methods in Medical Research, 5(2):107-127. \doi{10.1177/096228029600500202}.
#'
#' Fialkowski AC (2017). SimMultiCorrData: Simulation of Correlated Data with Multiple Variable Types. R package version 0.2.1.
#'     \url{https://CRAN.R-project.org/package=SimMultiCorrData}.
#'
#' Fialkowski AC (2018). SimCorrMix: Simulation of Correlated Data of Multiple Variable Types including Continuous and Count
#'     Mixture Distributions. R package version 0.1.0. \url{https://github.com/AFialkowski/SimCorrMix}
#'
#' Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43:521-532. \doi{10.1007/BF02293811}.
#'
#' Headrick TC (2002). Fast Fifth-order Polynomial Transforms for Generating Univariate and Multivariate
#'     Non-normal Distributions. Computational Statistics & Data Analysis, 40(4):685-711. \doi{10.1016/S0167-9473(02)00072-5}.
#'     (\href{http://www.sciencedirect.com/science/article/pii/S0167947302000725}{ScienceDirect})
#'
#' Headrick TC (2004). On Polynomial Transformations for Simulating Multivariate Nonnormal Distributions.
#'     Journal of Modern Applied Statistical Methods, 3(1):65-71. \doi{10.22237/jmasm/1083370080}.
#'
#' Headrick TC, Beasley TM (2004).  A Method for Simulating Correlated Non-Normal Systems of Linear Statistical Equations.
#'     Communications in Statistics - Simulation and Computation, 33(1).  \doi{10.1081/SAC-120028431}
#'
#' Headrick TC, Kowalchuk RK (2007). The Power Method Transformation: Its Probability Density Function, Distribution
#'     Function, and Its Further Use for Fitting Data. Journal of Statistical Computation and Simulation, 77:229-249. \doi{10.1080/10629360600605065}.
#'
#' Headrick TC, Sawilowsky SS (1999). Simulating Correlated Non-normal Distributions: Extending the Fleishman Power
#'     Method. Psychometrika, 64:25-35. \doi{10.1007/BF02294317}.
#'
#' Headrick TC, Sheng Y, & Hodis FA (2007). Numerical Computing and Graphics for the Power Method Transformation Using
#'     Mathematica. Journal of Statistical Software, 19(3):1 - 17. \cr \doi{10.18637/jss.v019.i03}.
#'
#' Higham N (2002). Computing the nearest correlation matrix - a problem from finance; IMA Journal of Numerical Analysis 22:329-343.
#'
#' McCulloch CE, Searle SR, Neuhaus JM (2008). \emph{Generalized, Linear, and Mixed Models} (2nd ed.). Wiley Series in Probability and
#'     Statistics. Hoboken, New Jersey: John Wiley & Sons, Inc.
#'
#' Pearson RK (2011). Exploring Data in Engineering, the Sciences, and Medicine. In. New York: Oxford University Press.
#'
#' Schork NJ, Allison DB, & Thiel B (1996). Mixture Distributions in Human Genetics Research. Statistical Methods in Medical Research,
#'     5:155-178. \doi{10.1177/096228029600500204}.
#'
#' Vale CD & Maurelli VA (1983). Simulating Multivariate Nonnormal Distributions. Psychometrika, 48:465-471. \doi{10.1007/BF02293687}.
#'
#' @examples
#' M <- 3
#' B <- calc_theory("Beta", c(4, 1.5))
#' skews <- lapply(seq_len(M), function(x) c(0, B[3]))
#' skurts <- lapply(seq_len(M), function(x) c(0, B[4]))
#' fifths <- lapply(seq_len(M), function(x) c(0, B[5]))
#' sixths <- lapply(seq_len(M), function(x) c(0, B[6]))
#' Six <- lapply(seq_len(M), function(x) list(NULL, 0.03))
#' corr.e <- matrix(c(1, 0.4, 0.4^2, 0.4, 1, 0.4, 0.4^2, 0.4, 1), M, M,
#'   byrow = TRUE)
#' means <- lapply(seq_len(M), function(x) c(0, B[1]))
#' vars <- lapply(seq_len(M), function(x) c(1, B[2]^2))
#' corr.x <- list(list(matrix(1, 1, 1), matrix(0.4, 1, 1), matrix(0.4, 1, 1)),
#'   list(matrix(0.4, 1, 1), matrix(1, 1, 1), matrix(0.4, 1, 1)),
#'   list(matrix(0.4, 1, 1), matrix(0.4, 1, 1), matrix(1, 1, 1)))
#' corr.yx <- list(matrix(0.4, 1), matrix(0.5, 1), matrix(0.6, 1))
#' Sys1 <- nonnormsys(10000, M, "Polynomial", "non_mix", means, vars,
#'   skews, skurts, fifths, sixths, Six, corr.x = corr.x, corr.yx = corr.yx,
#'   corr.e = corr.e)
#'
#' \dontrun{
#' # Example: system of three equations for 2 independent variables, where each
#' # error term has unit variance, from Headrick & Beasley (2002)
#' # Y_1 = beta_10 + beta_11 * X_11 + beta_12 * X_12 + sigma_1 * e_1
#' # Y_2 = beta_20 + beta_21 * X_21 + beta_22 * X_22 + sigma_2 * e_2
#' # Y_3 = beta_30 + beta_31 * X_31 + beta_32 * X_32 + sigma_3 * e_3
#'
#' # X_11 = X_21 = X_31 = Exponential(2)
#' # X_12 = X_22 = X_32 = Laplace(0, 1)
#' # e_1 = e_2 = e_3 = Cauchy(0, 1)
#'
#' seed <- 1234
#' M <- 3
#' Stcum1 <- calc_theory("Exponential", 2)
#' Stcum2 <- calc_theory("Laplace", c(0, 1))
#' Stcum3 <- c(0, 1, 0, 25, 0, 1500) # taken from paper
#' means <- lapply(seq_len(M), function(x) c(0, 0, 0))
#' vars <- lapply(seq_len(M), function(x) c(1, 1, 1))
#' skews <- lapply(seq_len(M), function(x) c(Stcum1[3], Stcum2[3], Stcum3[3]))
#' skurts <- lapply(seq_len(M), function(x) c(Stcum1[4], Stcum2[4], Stcum3[4]))
#' fifths <- lapply(seq_len(M), function(x) c(Stcum1[5], Stcum2[5], Stcum3[5]))
#' sixths <- lapply(seq_len(M), function(x) c(Stcum1[6], Stcum2[6], Stcum3[6]))
#'
#' # No sixth cumulant corrections will be used in order to match the results
#' # from the paper.
#'
#' corr.yx <- list(matrix(c(0.4, 0.4), 1), matrix(c(0.5, 0.5), 1),
#'   matrix(c(0.6, 0.6), 1))
#' corr.x <- list()
#' corr.x[[1]] <- corr.x[[2]] <- corr.x[[3]] <- list()
#' corr.x[[1]][[1]] <- matrix(c(1, 0.1, 0.1, 1), 2, 2)
#' corr.x[[1]][[2]] <- matrix(c(0.1974318, 0.1859656, 0.1879483, 0.1858601),
#'   2, 2, byrow = TRUE)
#' corr.x[[1]][[3]] <- matrix(c(0.2873190, 0.2589830, 0.2682057, 0.2589542),
#'   2, 2, byrow = TRUE)
#' corr.x[[2]][[1]] <- t(corr.x[[1]][[2]])
#' corr.x[[2]][[2]] <- matrix(c(1, 0.35, 0.35, 1), 2, 2)
#' corr.x[[2]][[3]] <- matrix(c(0.5723303, 0.4883054, 0.5004441, 0.4841808),
#'   2, 2, byrow = TRUE)
#' corr.x[[3]][[1]] <- t(corr.x[[1]][[3]])
#' corr.x[[3]][[2]] <- t(corr.x[[2]][[3]])
#' corr.x[[3]][[3]] <- matrix(c(1, 0.7, 0.7, 1), 2, 2)
#' corr.e <- matrix(0.4, nrow = 3, ncol = 3)
#' diag(corr.e) <- 1
#'
#' # Check the parameter inputs
#' checkpar(M, "Polynomial", "non_mix", means, vars, skews,
#'   skurts, fifths, sixths, corr.x = corr.x, corr.yx = corr.yx,
#'   corr.e = corr.e)
#' # Generate the system
#' Sys1 <- nonnormsys(10000, M, "Polynomial", "non_mix", means, vars, skews,
#'   skurts, fifths, sixths, corr.x = corr.x, corr.yx = corr.yx,
#'   corr.e = corr.e, seed = seed)
#' # Summarize the results
#' Sum1 <- summary_sys(Sys1$Y, Sys1$E, E_mix = NULL, Sys1$X, X_all = list(), M,
#'   "Polynomial", means, vars, skews, skurts, fifths, sixths, corr.x = corr.x,
#'   corr.e = corr.e)
#'
#' # Calculate theoretical correlations for comparison to simulated values
#' calc_corr_y(Sys1$betas, corr.x, corr.e, vars)
#' Sum1$rho.y
#' calc_corr_ye(Sys1$betas, corr.x, corr.e, vars)
#' Sum1$rho.ye
#' calc_corr_yx(Sys1$betas, corr.x, vars)
#' Sum1$rho.yx
#' }
nonnormsys <- function(n = 10000, M = NULL,
                       method = c("Fleishman", "Polynomial"),
                       error_type = c("non_mix", "mix"),
                       means = list(), vars = list(), skews = list(),
                       skurts = list(), fifths = list(), sixths = list(),
                       Six = list(), mix_pis = list(), mix_mus = list(),
                       mix_sigmas = list(), mix_skews =  list(),
                       mix_skurts =  list(), mix_fifths =  list(),
                       mix_sixths =  list(), mix_Six = list(),
                       same.var = NULL, betas.0 = NULL, corr.x = list(),
                       corr.yx = list(), corr.e = NULL, seed = 1234,
                       use.nearPD = TRUE, eigmin = 0, adjgrad = FALSE,
                       B1 = NULL, tau = 0.5, tol = 0.1, steps = 100,
                       msteps = 10, errorloop = FALSE, epsilon = 0.001,
                       maxit = 1000, quiet = FALSE) {
  start.time <- Sys.time()
  if (length(error_type) != 1)
    stop("Please choose one type of distribution for all of the error terms:
         mix if all errors have continuous mixture distributions,
         non_mix if all errors have continuous non-mixture distributions.")
  K.mix <- rep(0, M)
  K.comp <- rep(0, M)
  K.error <- rep(0, M)
  if (length(mix_pis) > 0) {
    K.mix <- lengths(mix_pis)
    K.comp <- sapply(lapply(mix_pis, unlist), length)
  }
  k.comp <- c(0, cumsum(K.comp))
  K.cont <- lengths(vars) - K.mix
  k.cont <- c(0, cumsum(K.cont))
  k.mix <- c(0, cumsum(K.mix))
  if (error_type == "mix") {
    K.error <- sapply(mix_pis, function(x) lengths(x[length(x)]))
    k.error <- c(0, cumsum(K.error))
    K.x <- K.cont + K.comp - K.error
    K.mix2 <- K.mix - 1
  } else {
    K.x <- K.cont + K.comp - 1
    K.mix2 <- K.mix
  }
  if (is.null(betas.0)) betas.0 <- rep(0, M)
  if (length(betas.0) == 1) betas.0 <- rep(betas.0, M)
  if (length(betas.0) != M)
    stop("Length of intercept vector betas.0 should be M!")
  same.cont <- NULL
  if (!is.null(same.var)) {
    if (class(same.var) == "numeric") {
      for (j in 1:length(same.var)) {
        same.cont <- rbind(same.cont, cbind(rep(1, M - 1),
          rep(same.var[j], M - 1), 2:M, rep(same.var[j], M - 1)))
      }
      same.var <- same.cont
    } else same.cont <- same.var
  }
  csame.dist <- NULL
  csame.dist2 <- NULL
  skews2 <- NULL
  skurts2 <- NULL
  fifths2 <- NULL
  sixths2 <- NULL
  Six2 <- list()
  if (length(skews) > 0) {
    skews2 <- unlist(skews)
    skurts2 <- unlist(skurts)
    if (method == "Polynomial") {
      fifths2 <- unlist(fifths)
      sixths2 <- unlist(sixths)
      Six2 <- unlist(Six, recursive = FALSE)
    }
    if (length(skews2) > 1) {
      for (i in 2:length(skews2)) {
        if (skews2[i] %in% skews2[1:(i - 1)]) {
          csame <- which(skews2[1:(i - 1)] == skews2[i])
          for (j in 1:length(csame)) {
            if (method == "Polynomial") {
              if ((skurts2[i] == skurts2[csame[j]]) &
                  (fifths2[i] == fifths2[csame[j]]) &
                  (sixths2[i] == sixths2[csame[j]])) {
                csame.dist <- rbind(csame.dist, c(csame[j], i))
                break
              }
            }
            if (method == "Fleishman") {
              if (skurts2[i] == skurts2[csame[j]]) {
                csame.dist <- rbind(csame.dist, c(csame[j], i))
                break
              }
            }
          }
        }
      }
      if (!is.null(csame.dist)) {
        ind.p <- sapply(csame.dist[, 1],
                        function(x) min(which(k.cont >= x)) - 1)
        ind.q <- sapply(csame.dist[, 2],
                        function(x) min(which(k.cont >= x)) - 1)
        ind.p2 <- numeric(length(ind.p))
        ind.q2 <- numeric(length(ind.q))
        for (i in 1:length(ind.p)) {
          ind.p2[i] <- csame.dist[i, 1] - k.cont[ind.p[i]]
          ind.q2[i] <- csame.dist[i, 2] - k.cont[ind.q[i]]
        }
        csame.dist2 <- cbind(ind.p, ind.p2, ind.q, ind.q2)
      }
    }
  }
  msame.dist2 <- NULL
  if (length(mix_pis) > 0) {
    mix_skews2 <- unlist(mix_skews)
    mix_skurts2 <- unlist(mix_skurts)
    if (method == "Polynomial") {
      mix_fifths2 <- unlist(mix_fifths)
      mix_sixths2 <- unlist(mix_sixths)
    }
    for (i in 1:length(mix_skews2)) {
      msame.d <- NULL
      if (length(skews) > 0) {
        if (mix_skews2[i] %in% skews2) {
          msame <- which(skews2 == mix_skews2[i])
          for (j in 1:length(msame)) {
            if (method == "Polynomial") {
              if ((mix_skurts2[i] == skurts2[msame[j]]) &
                  (mix_fifths2[i] == fifths2[msame[j]]) &
                  (mix_sixths2[i] == sixths2[msame[j]])) {
                msame.d <- c(msame[j], i)
                msame.d <- c(min(which(k.cont >= msame.d[1])) - 1,
                  msame.d[1] - k.cont[min(which(k.cont >= msame.d[1])) - 1],
                  min(which(k.comp >= msame.d[2])) - 1,
                  msame.d[2] - k.comp[min(which(k.comp >= msame.d[2])) - 1])
                msame.d[4] <- msame.d[4] + K.cont[msame.d[3]]
                break
              }
            }
            if (method == "Fleishman") {
              if (mix_skurts2[i] == skurts2[msame[j]]) {
                msame.d <- c(msame[j], i)
                msame.d <- c(min(which(k.cont >= msame.d[1])) - 1,
                  msame.d[1] - k.cont[min(which(k.cont >= msame.d[1])) - 1],
                  min(which(k.comp >= msame.d[2])) - 1,
                  msame.d[2] - k.comp[min(which(k.comp >= msame.d[2])) - 1])
                msame.d[4] <- msame.d[4] + K.cont[msame.d[3]]
                break
              }
            }
          }
        }
      }
      if (is.null(msame.d) & i >= 2) {
        if (mix_skews2[i] %in% mix_skews2[1:(i-1)]) {
          msame <- which(mix_skews2[1:(i - 1)] == mix_skews2[i])
          for (j in 1:length(msame)) {
            if (method == "Polynomial") {
              if ((mix_skurts2[i] == mix_skurts2[msame[j]]) &
                  (mix_fifths2[i] == mix_fifths2[msame[j]]) &
                  (mix_sixths2[i] == mix_sixths2[msame[j]])) {
                msame.d <- c(msame[j], i)
                msame.d <- c(min(which(k.comp >= msame.d[1])) - 1,
                  msame.d[1] - k.comp[min(which(k.comp >= msame.d[1])) - 1],
                  min(which(k.comp >= msame.d[2])) - 1,
                  msame.d[2] - k.comp[min(which(k.comp >= msame.d[2])) - 1])
                msame.d[2] <- msame.d[2] + K.cont[msame.d[1]]
                msame.d[4] <- msame.d[4] + K.cont[msame.d[3]]
                break
              }
            }
            if (method == "Fleishman") {
              if (mix_skurts2[i] == mix_skurts2[msame[j]]) {
                msame.d <- c(msame[j], i)
                msame.d <- c(min(which(k.comp >= msame.d[1])) - 1,
                  msame.d[1] - k.comp[min(which(k.comp >= msame.d[1])) - 1],
                  min(which(k.comp >= msame.d[2])) - 1,
                  msame.d[2] - k.comp[min(which(k.comp >= msame.d[2])) - 1])
                msame.d[2] <- msame.d[2] + K.cont[msame.d[1]]
                msame.d[4] <- msame.d[4] + K.cont[msame.d[3]]
                break
              }
            }
          }
        }
      }
      msame.dist2 <- rbind(msame.dist2, msame.d)
    }
    mix_skews2 <- lapply(mix_skews, unlist)
    mix_skurts2 <- lapply(mix_skurts, unlist)
    if (method == "Polynomial") {
      mix_fifths2 <- lapply(mix_fifths, unlist)
      mix_sixths2 <- lapply(mix_sixths, unlist)
    }
  }
  constants <- list()
  SixCorr <- list()
  Valid.PDF <- list()
  for (i in 1:M) {
    SixCorr[[i]] <- numeric(K.cont[i] + K.comp[i])
    Valid.PDF[[i]] <- numeric(K.cont[i] + K.comp[i])
    if (method == "Fleishman") {
      constants[[i]] <- matrix(NA, nrow = K.cont[i] + K.comp[i], ncol = 4)
      colnames(constants[[i]]) <- c("c0", "c1", "c2", "c3")
    }
    if (method == "Polynomial") {
      constants[[i]] <- matrix(NA, nrow = K.cont[i] + K.comp[i], ncol = 6)
      colnames(constants[[i]]) <- c("c0", "c1", "c2", "c3", "c4", "c5")
    }
    if (K.cont[i] > 0) {
      for (j in 1:K.cont[i]) {
        if (!is.null(csame.dist2)) {
          rind <- which(csame.dist2[, 3] == i & csame.dist2[, 4] == j)
          if (length(rind) > 0) {
            constants[[i]][j, ] <-
              constants[[csame.dist2[rind, 1]]][csame.dist2[rind, 2], ]
            SixCorr[[i]][j] <-
              SixCorr[[csame.dist2[rind, 1]]][csame.dist2[rind, 2]]
            Valid.PDF[[i]][j] <-
              Valid.PDF[[csame.dist2[rind, 1]]][csame.dist2[rind, 2]]
          }
        }
        if (sum(is.na(constants[[i]][j, ])) > 0) {
          if (length(Six) == 0) Six2 <- NULL else
            if (length(Six[[i]]) == 0) Six2 <- NULL else
              Six2 <- Six[[i]][[j]]
          cons <-
            suppressWarnings(find_constants(method, skews[[i]][j],
              skurts[[i]][j], fifths[[i]][j], sixths[[i]][j],
              Six = Six2, n = 25, seed = seed))
          if (length(cons) == 1 | is.null(cons)) {
            stop(paste("Constants can not be found for outcome ", i,
                       " non-mixture variable ", j, ".", sep = ""))
          }
          SixCorr[[i]][j] <- ifelse(is.null(cons$SixCorr1), NA, cons$SixCorr1)
          Valid.PDF[[i]][j] <- cons$valid
          constants[[i]][j, ] <- cons$constants
        }
      }
    }
  }
  for (i in 1:M) {
    if (K.mix[i] > 0) {
      for (j in (K.cont[i] + 1):(K.cont[i] + K.comp[i])) {
        if (!is.null(msame.dist2)) {
          rind <- which(msame.dist2[, 3] == i & msame.dist2[, 4] == j)
          if (length(rind) > 0) {
            constants[[i]][j, ] <-
              constants[[msame.dist2[rind, 1]]][msame.dist2[rind, 2], ]
            SixCorr[[i]][j] <-
              SixCorr[[msame.dist2[rind, 1]]][msame.dist2[rind, 2]]
            Valid.PDF[[i]][j] <-
              Valid.PDF[[msame.dist2[rind, 1]]][msame.dist2[rind, 2]]
          }
        }
        if (sum(is.na(constants[[i]][j, ])) > 0) {
          if (length(mix_Six) == 0) mix_Six2 <- NULL else
            if (length(mix_Six[[i]]) == 0) mix_Six2 <- NULL else
              mix_Six2 <- mix_Six[[i]][[j - K.cont[i]]]
          cons <-
            suppressWarnings(find_constants(method,
              mix_skews2[[i]][j - K.cont[i]], mix_skurts2[[i]][j - K.cont[i]],
              mix_fifths2[[i]][j - K.cont[i]], mix_sixths2[[i]][j - K.cont[i]],
              Six = mix_Six2, n = 25, seed = seed))
          if (length(cons) == 1 | is.null(cons)) {
            stop(paste("Constants can not be found for outcome ", i,
                 " mixture component ", j - K.cont[i], ".", sep = ""))
          }
          SixCorr[[i]][j] <- ifelse(is.null(cons$SixCorr1), NA, cons$SixCorr1)
          Valid.PDF[[i]][j] <- cons$valid
          constants[[i]][j, ] <- cons$constants
        }
      }
    }
  }
  if (error_type == "non_mix" & max(K.mix2) > 0) {
    for (i in 1:M) {
      constants[[i]] <- rbind(constants[[i]][-K.cont[i], ],
        constants[[i]][K.cont[i], ])
    }
  }
  e.constants <- NULL
  if (error_type == "non_mix") {
    e.means <- mapply('[[', means, lengths(means))
    e.vars <- mapply('[[', vars, lengths(vars))
    for (i in 1:M) {
      e.constants <- rbind(e.constants, constants[[i]][nrow(constants[[i]]), ])
    }
  } else {
    e.means <- unlist(mapply('[', mix_mus, lengths(mix_mus)))
    e.vars <- (unlist(mapply('[', mix_sigmas, lengths(mix_sigmas))))^2
    for (i in 1:M) {
      e.constants <- rbind(e.constants,
        constants[[i]][(nrow(constants[[i]]) - K.error[i] +
                          1):nrow(constants[[i]]), ])
    }
  }
  Sigma_E <- intercorr_cont(method, e.constants, corr.e)
  set.seed(seed)
  if (!is.null(same.var)) {
    Z <- matrix(rnorm((ncol(Sigma_E) + sum(K.x) - nrow(same.var)) * n), n)
  } else Z <- matrix(rnorm((ncol(Sigma_E) + sum(K.x)) * n), n)
  Z <- scale(Z, TRUE, FALSE)
  Z <- Z %*% svd(Z, nu = 0)$v
  Z <- scale(Z, FALSE, TRUE)
  if (min(eigen(Sigma_E, symmetric = TRUE)$values) < 0) {
    if (quiet == FALSE)
      message("Intermediate E correlation matrix is not positive definite.")
    if (use.nearPD == TRUE) {
      Sigma_E <- as.matrix(nearPD(Sigma_E, corr = T, keepDiag = T)$mat)
    } else if (adjgrad == TRUE) {
      sadj <- adj_grad(Sigma_E, B1, tau, tol, steps, msteps)
      Sigma_E <- sadj$Sigma2
    }
  }
  eig <- eigen(Sigma_E, symmetric = TRUE)
  sqrteigval <- diag(sqrt(pmax(eig$values, 0)), nrow(Sigma_E))
  eigvec <- eig$vectors
  fry <- eigvec %*% sqrteigval
  E <- fry %*% t(Z[, (ncol(Z) - ncol(Sigma_E) + 1):ncol(Z)])
  E <- t(E)
  E2 <- matrix(1, nrow = n, ncol = ncol(Sigma_E))
  for (i in 1:ncol(Sigma_E)) {
    if (method == "Fleishman") {
      E2[, i] <- e.constants[i, 1] + e.constants[i, 2] * E[, i] +
        e.constants[i, 3] * E[, i]^2 + e.constants[i, 4] * E[, i]^3
    }
    if (method == "Polynomial") {
      E2[, i] <- e.constants[i, 1] + e.constants[i, 2] * E[, i] +
        e.constants[i, 3] * E[, i]^2 + e.constants[i, 4] * E[, i]^3 +
        e.constants[i, 5] * E[, i]^4 + e.constants[i, 6] * E[, i]^5
    }
    E2[, i] <- e.means[i] + sqrt(e.vars[i]) * E2[, i]
  }
  E_comp <- E2
  colnames(E_comp) <- paste("E", 1:ncol(E_comp), sep = "")
  if (error_type == "mix") {
    E2 <- matrix(1, n, M)
    for (i in 1:M) {
      seed <- seed + 1
      set.seed(seed)
      R <- rmultinom(n, size = 1, prob = mix_pis[[i]][[length(mix_pis[[i]])]])
      E2[, i] <- apply(t(R) * E_comp[, (k.error[i] + 1):k.error[i + 1]], 1,
                       sum)
      E2[, i] <- scale(E2[, i])
      E2[, i] <- means[[i]][length(means[[i]])] +
        sqrt(vars[[i]][length(vars[[i]])]) * E2[, i]
    }
  }
  colnames(E2) <- paste("E", 1:M, sep = "")
  constants0 <- constants
  constants2 <- NULL
  for (p in 1:M) {
    if (K.x[p] == 0) next
    rownames(corr.x[[p]][[p]]) <- paste("cont", p, "_", 1:K.x[p], sep = "")
    if (!is.null(same.cont)) {
      if (p %in% same.cont[, 3])
        constants[[p]] <-
          constants[[p]][-(same.cont[same.cont[, 3] == p, 4]), , drop = FALSE]
    }
    if (error_type == "non_mix") {
      constants2 <- rbind(constants2,
        constants[[p]][-nrow(constants[[p]]), , drop = FALSE])
    } else {
      constants2 <- rbind(constants2,
        constants[[p]][-c((nrow(constants[[p]]) - K.error[p] +
                             1):nrow(constants[[p]])), , drop = FALSE])
    }
    if (!is.null(same.var)) {
      same.var0 <- same.var[same.var[, 3] == p, , drop = FALSE]
      if (p != 1 & nrow(same.var0) != 0) {
        q <- unique(same.var0[, 1])
        for (i in 1:length(q)) {
          rownames(corr.x[[p]][[p]])[same.var0[same.var0[, 1] == q[i], 4]] <-
            rownames(corr.x[[q[i]]][[q[i]]])[same.var0[same.var0[, 1] == q[i],
                                                       2]]
        }
      }
    }
    colnames(corr.x[[p]][[p]]) <- rownames(corr.x[[p]][[p]])
  }
  names0 <- list()
  for (p in 1:M) {
    if (length(corr.x[[p]]) == 0) {
      names0 <- append(names0, list(NULL))
      next
    } else {
      names0[[p]] <- colnames(corr.x[[p]][[p]])
      for (q in 1:M) {
        if (length(corr.x[[p]][[q]]) == 0) next
        colnames(corr.x[[p]][[q]]) <- colnames(corr.x[[q]][[q]])
        rownames(corr.x[[p]][[q]]) <- rownames(corr.x[[p]][[p]])
      }
    }
  }
  corr.x0 <- corr.x
  Corr_X <- list()
  if (!is.null(same.var)) {
    for (p in 1:(M - 1)) {
      if (K.x[p] == 0) next
      for (q in (p + 1):M) {
        if (K.x[q] == 0) next
        same.var0 <- same.var[(same.var[, 1] == p & same.var[, 3] == q), ,
                              drop = FALSE]
        if (nrow(same.var0) != 0) {
          for (i in 1:M) {
            if (K.x[i] == 0) next
            for (j in 1:M) {
              if (K.x[j] == 0) next
              if (i == q & j != q)
                corr.x0[[i]][[j]] <- corr.x0[[i]][[j]][-same.var0[, 4], ,
                                                       drop = FALSE]
              if (i != q & j == q)
                corr.x0[[i]][[j]] <- corr.x0[[i]][[j]][, -same.var0[, 4],
                                                       drop = FALSE]
              if (i == q & j == q)
                corr.x0[[i]][[j]] <- corr.x0[[i]][[j]][-same.var0[, 4],
                  -same.var0[, 4], drop = FALSE]
            }
          }
        }
      }
    }
  }
  for (p in 1:M) {
    Corr_X <- append(Corr_X, list(NULL))
    if (length(corr.x0[[p]]) > 0) {
      for (q in 1:M) {
        if (length(corr.x0[[p]][[q]]) == 0) next else
          if (nrow(corr.x0[[p]][[q]]) == 0 |
              ncol(corr.x0[[p]][[q]]) == 0) next else
                Corr_X[[p]] <- cbind(Corr_X[[p]], corr.x0[[p]][[q]])
      }
    }
  }
  Corr_X <- do.call(rbind, Corr_X)
  names1 <- colnames(Corr_X)
  Sigma_X <- intercorr_cont(method = method, constants = constants2,
                            rho_cont = Corr_X)
  colnames(Sigma_X) <- colnames(Corr_X)
  rownames(Sigma_X) <- rownames(Corr_X)
  Sigma_X0 <- Sigma_X
  if (min(eigen(Sigma_X, symmetric = TRUE)$values) < 0) {
    if (quiet == FALSE)
      message("Intermediate correlation matrix is not positive definite.")
    if (use.nearPD == TRUE) {
      Sigma_X <- as.matrix(nearPD(Sigma_X, corr = T, keepDiag = T)$mat)
    } else if (adjgrad == TRUE) {
      sadj <- adj_grad(Sigma_X, B1, tau, tol, steps, msteps)
      Sigma_X <- sadj$Sigma2
    }
  }
  eig <- eigen(Sigma_X, symmetric = TRUE)
  sqrteigval <- diag(sqrt(pmax(eig$values, 0)), nrow(Sigma_X))
  eigvec <- eig$vectors
  fry <- eigvec %*% sqrteigval
  X <- fry %*% t(Z[, 1:ncol(Sigma_X), drop = FALSE])
  X <- t(X)
  colnames(X) <- colnames(Corr_X)
  colnames(Sigma_X) <- colnames(Corr_X)
  rownames(Sigma_X) <- rownames(Corr_X)
  X_cont <- list()
  for (i in 1:M) {
    if (K.x[i] != 0) {
      if (i == 1) {
        X_cont[[i]] <- X[, 1:K.x[i], drop = FALSE]
      } else {
        if (!is.null(same.var)) {
          if (K.x[i] == nrow(same.var[same.var[, 3] == i, , drop = FALSE])) {
            X_cont <- append(X_cont, list(NULL))
          } else {
            X_cont[[i]] <- X[, (cumsum(K.x)[i - 1] -
              nrow(same.var[same.var[, 3] <= (i - 1), , drop = FALSE]) +
              1):(cumsum(K.x)[i] - nrow(same.var[same.var[, 3] <= i, ,
                                               drop = FALSE])), drop = FALSE]
          }
        } else {
          X_cont[[i]] <- X[, (cumsum(K.x)[i - 1] + 1):cumsum(K.x)[i],
                       drop = FALSE]
        }
      }
    } else X_cont <- append(X_cont, list(NULL))
  }
  Y_cont <- list()
  means2 <- list()
  vars2 <- list()
  temp_means <- NULL
  temp_vars <- NULL
  for (m in 1:M) {
    Y_cont <- append(Y_cont, list(NULL))
    means2 <- append(means2, list(NULL))
    vars2 <- append(vars2, list(NULL))
    if (K.x[m] == 0 | length(grep("cont", colnames(X_cont[[m]]))) == 0)
      next else {
      if (error_type == "non_mix") {
        if ((K.cont[m] - 1) > 0) {
          means2[[m]] <- means[[m]][1:(K.cont[m] - 1)]
          vars2[[m]] <- vars[[m]][1:(K.cont[m] - 1)]
        }
        if (K.mix2[m] > 0) {
          means2[[m]] <- c(means2[[m]], unlist(mix_mus[[m]]))
          vars2[[m]] <- c(vars2[[m]], (unlist(mix_sigmas[[m]]))^2)
        }
      }
      if (error_type == "mix") {
        if (K.cont[m] > 0) {
          means2[[m]] <- means[[m]][1:K.cont[m]]
          vars2[[m]] <- vars[[m]][1:K.cont[m]]
        }
        if (K.mix2[m] > 0) {
          means2[[m]] <- c(means2[[m]],
            unlist(mix_mus[[m]][-length(mix_mus[[m]])]))
          vars2[[m]] <- c(vars2[[m]],
            (unlist(mix_sigmas[[m]][-length(mix_sigmas[[m]])]))^2)
        }
      }
      if (!is.null(same.cont)) {
        if (m %in% same.cont[, 3]) {
          means2[[m]] <- means2[[m]][-(same.cont[same.cont[, 3] == m, 4])]
          vars2[[m]] <- vars2[[m]][-(same.cont[same.cont[, 3] == m, 4])]
        }
      }
      Y_cont[[m]] <- matrix(1, nrow = n, ncol = ncol(X_cont[[m]]))
      for (i in 1:ncol(X_cont[[m]])) {
        if (method == "Fleishman") {
          Y_cont[[m]][, i] <- constants[[m]][i, 1] +
            constants[[m]][i, 2] * X_cont[[m]][, i] +
            constants[[m]][i, 3] * X_cont[[m]][, i]^2 +
            constants[[m]][i, 4] * X_cont[[m]][, i]^3
        }
        if (method == "Polynomial") {
          Y_cont[[m]][, i] <- constants[[m]][i, 1] +
            constants[[m]][i, 2] * X_cont[[m]][, i] +
            constants[[m]][i, 3] * X_cont[[m]][, i]^2 +
            constants[[m]][i, 4] * X_cont[[m]][, i]^3 +
            constants[[m]][i, 5] * X_cont[[m]][, i]^4 +
            constants[[m]][i, 6] * X_cont[[m]][, i]^5
        }
        Y_cont[[m]][, i] <- means2[[m]][i] +
          sqrt(vars2[[m]][i]) * Y_cont[[m]][, i]
      }
      colnames(Y_cont[[m]]) <- colnames(X_cont[[m]])
      temp_means <- c(temp_means, means2[[m]])
      temp_vars <- c(temp_vars, vars2[[m]])
    }
  }
  Y_all <- NULL
  for (p in 1:M) {
    Y_all <- cbind(Y_all, Y_cont[[p]])
  }
  rho.x0 <- cor(Y_all)
  emax0 <- max(abs(rho.x0 - Corr_X))
  niter <- diag(0, nrow(rho.x0), ncol(rho.x0))
  start.time.error <- Sys.time()
  if (emax0 > epsilon & errorloop == TRUE) {
    EL <- corr_error(n = n, k_cont = length(grep("cont", colnames(Y_all))),
      method = method, means = temp_means, vars = temp_vars,
      constants = constants2, seed = seed, epsilon = epsilon, maxit = maxit,
      rho0 = Corr_X, Sigma = Sigma_X, rho_calc = rho.x0)
    Y_all <- EL$Y_cont
    colnames(Y_all) <- colnames(Corr_X)
    niter <- EL$niter
    colnames(niter) <- colnames(Corr_X)
    rownames(niter) <- colnames(Corr_X)
    Sigma_X <- EL$Sigma
  }
  stop.time.error <- Sys.time()
  Y_cont <- list()
  for (i in 1:M) {
    if (K.x[i] != 0) {
      if (i == 1) {
        Y_cont[[i]] <- Y_all[, 1:K.x[i], drop = FALSE]
      } else {
        if (!is.null(same.var)) {
          if (K.x[i] == nrow(same.var[same.var[, 3] == i, , drop = FALSE])) {
            Y_cont <- append(Y_cont, list(NULL))
          } else {
            Y_cont[[i]] <- Y_all[, (cumsum(K.x)[i - 1] -
              nrow(same.var[same.var[, 3] <= (i - 1), , drop = FALSE]) +
              1):(cumsum(K.x)[i] - nrow(same.var[same.var[, 3] <= i, ,
              drop = FALSE])), drop = FALSE]
          }
        } else {
          Y_cont[[i]] <- Y_all[, (cumsum(K.x)[i - 1] + 1):cumsum(K.x)[i],
                           drop = FALSE]
        }
      }
    } else Y_cont <- append(Y_cont, list(NULL))
  }
  for (m in 1:M) {
    if (K.x[m] > 0) {
      if (!is.null(same.cont)) {
        if (nrow(same.cont[same.cont[, 3] == m, , drop = FALSE]) > 0) {
          temp.cont <- same.cont[same.cont[, 3] == m, , drop = FALSE]
          for (i in 1:nrow(temp.cont)) {
            Y_cont[[m]] <- cbind(Y_cont[[m]],
                                 Y_cont[[temp.cont[i, 1]]][, temp.cont[i, 2]])
            colnames(Y_cont[[m]])[ncol(Y_cont[[m]])] <-
              colnames(Y_cont[[temp.cont[i, 1]]])[temp.cont[i, 2]]
          }
        }
      }
      if (length(Y_cont[[m]]) > 0) {
        Y_cont[[m]] <- Y_cont[[m]][, names0[[m]], drop = FALSE]
      }
    }
  }
  Y_comp <- Y_cont
  if (max(K.mix2) > 0) {
    for (i in 1:M) {
      if (K.mix2[i] != 0) {
        if ((ncol(Y_cont[[i]]) - K.comp[i] + K.error[i]) > 0) {
          Y_cont[[i]] <- cbind(Y_cont[[i]][, 1:(ncol(Y_cont[[i]]) -
            K.comp[i] + K.error[i]), drop = FALSE], matrix(1, n, K.mix2[i]))
        } else {
          Y_cont[[i]] <- matrix(1, n, K.mix2[i])
        }
        for (j in 1:K.mix2[i]) {
          ind <- K.x[i] - K.comp[i] + K.error[i] + j
          ind2 <- c(0, cumsum(sapply(mix_pis[[i]], length)))
          if (!is.null(same.cont)) {
            if (nrow(same.cont[same.cont[, 3] == i, , drop = FALSE]) > 0) {
              temp.cont <- same.cont[same.cont[, 3] == i, , drop = FALSE]
              if (all((ind - j + 1 + ind2[j]):(ind - j + ind2[j + 1]) %in%
                      temp.cont[, 4])) {
                temp.cont <- temp.cont[temp.cont[, 4] %in%
                  (ind - j + 1 + ind2[j]):(ind - j + ind2[j + 1]), ,
                                       drop = FALSE]
                if (length(unique(temp.cont[, 1])) == 1 &
                    length(unique(temp.cont[, 2])) ==
                    length(mix_pis[[i]][[j]])) {
                  ind <- ncol(Y_comp[[i]]) - K.comp[i] + K.error[i] + j
                  ind3 <- cumsum(sapply(mix_pis[[temp.cont[1, 1]]], length))
                  ind4 <- ncol(Y_cont[[temp.cont[1, 1]]]) -
                    K.mix2[temp.cont[1, 1]] +
                    min(which(ind3 >= max(temp.cont[1, 2])))
                  Y_cont[[i]][, ind] <- Y_cont[[temp.cont[1, 1]]][, ind4,
                                                                  drop = FALSE]
                  colnames(Y_cont[[i]])[ind] <-
                    colnames(Y_cont[[temp.cont[1, 1]]])[ind4]
                  next
                }
              }
            }
          }
          seed <- seed + 1
          set.seed(seed)
          R <- rmultinom(n, size = 1, prob = mix_pis[[i]][[j]])
          ind <- ncol(Y_comp[[i]]) - K.comp[i] + K.error[i] + j
          Y_cont[[i]][, ind] <- apply(t(R) *
            Y_comp[[i]][, (ind - j + 1 + ind2[j]):(ind - j + ind2[j + 1]),
                                                    drop = FALSE], 1, sum)
          Y_cont[[i]][, ind] <- means[[i]][ind] +
            sqrt(vars[[i]][ind]) * scale(Y_cont[[i]][, ind])
          colnames(Y_cont[[i]])[ind] <- paste("mix", i, "_", j, sep = "")
        }
      }
    }
  }
  betas <- calc_betas(corr.yx, corr.x, vars, mix_pis, mix_mus, mix_sigmas,
    error_type, n = 25, seed = seed)
  if (length(betas) == 1 | is.null(betas)) {
    stop("The beta coefficients could not be found.")
  }
  for (p in 1:M) {
    if (K.x[p] == 0) next
    colnames(Y_comp[[p]]) <- paste("X", p, "_", 1:K.x[p], sep = "")
    colnames(Y_cont[[p]]) <- paste("X", p, "_", 1:ncol(Y_cont[[p]]), sep = "")
  }
  Y <- matrix(1, nrow = n, ncol = M)
  colnames(Y) <- paste("Y", 1:M, sep = "")
  for (i in 1:M) {
    if (K.x[i] == 0) {
      Y[, i] <- matrix(betas.0[i], n, 1) + E2[, i]
    } else if ((length(betas[i, ]) - sum(betas[i, ] == 0)) == K.x[i]) {
      Y[, i] <- matrix(betas.0[i], n, 1) +
        Y_comp[[i]] %*% matrix(betas[i, 1:K.x[i]], ncol = 1) + E2[, i]
    } else {
      Y[, i] <- matrix(betas.0[i], n, 1) + Y_cont[[i]] %*%
        matrix(betas[i, 1:ncol(Y_cont[[i]])], ncol = 1) + E2[, i]
    }
  }
  stop.time <- Sys.time()
  Time.error <- round(difftime(stop.time.error, start.time.error,
                               units = "min"), 3)
  Time <- round(difftime(stop.time, start.time, units = "min"), 3)
  cat("Total Simulation time:", Time, "minutes \n")
  result <- list(Y = Y, X = Y_comp, X_all = Y_cont, E = E_comp, betas = betas,
    Sigma_X0 = Sigma_X0, Sigma_X = Sigma_X, constants = constants0,
    SixCorr = SixCorr, valid.pdf = Valid.PDF, niter = niter,
    Error_Time = Time.error, Time = Time)
  if (error_type == "mix") result <- append(result, list(E_mix = E2))
  result
}
Any scripts or data that you put into this service are public.
SimRepeat documentation built on May 2, 2019, 9:32 a.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
SimRepeat
Simulation of Correlated Systems of Equations with Multiple Variable Types

R/nonnormsys.R
In SimRepeat: Simulation of Correlated Systems of Equations with Multiple Variable Types

Try the SimRepeat package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

SimRepeat Simulation of Correlated Systems of Equations with Multiple Variable Types

R/nonnormsys.R In SimRepeat: Simulation of Correlated Systems of Equations with Multiple Variable Types

Try the SimRepeat package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

SimRepeat
Simulation of Correlated Systems of Equations with Multiple Variable Types

R/nonnormsys.R
In SimRepeat: Simulation of Correlated Systems of Equations with Multiple Variable Types