Description Usage Arguments Value Reasons for Function Errors References See Also Examples
This function calculates the lower and upper correlation bounds for the given distributions and
checks if a given target correlation matrix rho
is within the bounds. It should be used before simulation with
corrvar
. However, even if all pairwise correlations fall within the bounds, it is still possible
that the desired correlation matrix is not feasible. This is particularly true when ordinal variables (r ≥ 2 categories) are
generated or negative correlations are desired. Therefore, this function should be used as a general check to eliminate pairwise correlations that are obviously
not reproducible. It will help prevent errors when executing the simulation. The ordering of the variables in rho
must be 1st ordinal, 2nd continuous nonmixture, 3rd components of continuous mixture, 4th regular Poisson, 5th zeroinflated
Poisson, 6th regular NB, and 7th zeroinflated NB. Note that it is possible for k_cat
, k_cont
, k_mix
,
k_pois
, and/or k_nb
to be 0. The target correlations are specified with respect to the components of the continuous
mixture variables. There are no parameter input checks in order to decrease simulation time. All inputs should be checked prior to simulation with
validpar
.
Please see the Comparison of Correlation Methods 1 and 2 vignette for the differences between the two correlation methods, and the Variable Types vignette for a detailed explanation of how the correlation boundaries are calculated.
1 2 3 4 5 6 7 8 9  validcorr(n = 10000, k_cat = 0, k_cont = 0, k_mix = 0, k_pois = 0,
k_nb = 0, method = c("Fleishman", "Polynomial"), means = NULL,
vars = NULL, skews = NULL, skurts = NULL, fifths = NULL,
sixths = NULL, Six = list(), mix_pis = list(), mix_mus = list(),
mix_sigmas = list(), mix_skews = list(), mix_skurts = list(),
mix_fifths = list(), mix_sixths = list(), mix_Six = list(),
marginal = list(), lam = NULL, p_zip = 0, size = NULL, prob = NULL,
mu = NULL, p_zinb = 0, rho = NULL, seed = 1234, use.nearPD = TRUE,
quiet = FALSE)

n 
the sample size (i.e. the length of each simulated variable; default = 10000) 
k_cat 
the number of ordinal (r >= 2 categories) variables (default = 0) 
k_cont 
the number of continuous nonmixture variables (default = 0) 
k_mix 
the number of continuous mixture variables (default = 0) 
k_pois 
the number of regular Poisson and zeroinflated Poisson variables (default = 0) 
k_nb 
the number of regular Negative Binomial and zeroinflated Negative Binomial variables (default = 0) 
method 
the method used to generate the k_cont nonmixture and k_mix mixture continuous variables. "Fleishman" uses Fleishman's thirdorder polynomial transformation and "Polynomial" uses Headrick's fifthorder transformation. 
means 
a vector of means for the k_cont nonmixture and k_mix mixture continuous variables
(i.e. 
vars 
a vector of variances for the k_cont nonmixture and k_mix mixture continuous variables
(i.e. 
skews 
a vector of skewness values for the 
skurts 
a vector of standardized kurtoses (kurtosis  3, so that normal variables have a value of 0)
for the 
fifths 
a vector of standardized fifth cumulants for the 
sixths 
a vector of standardized sixth cumulants for the 
Six 
a list of vectors of sixth cumulant correction values for the 
mix_pis 
a list of length 
mix_mus 
a list of length 
mix_sigmas 
a list of length 
mix_skews 
a list of length 
mix_skurts 
a list of length 
mix_fifths 
a list of length 
mix_sixths 
a list of length 
mix_Six 
a list of length 
marginal 
a list of length equal to 
lam 
a vector of lambda (> 0) constants for the Poisson variables (see 
p_zip 
a vector of probabilities of structural zeros (not including zeros from the Poisson distribution) for the
zeroinflated Poisson variables (see 
size 
a vector of size parameters for the Negative Binomial variables (see 
prob 
a vector of success probability parameters for the NB variables; order the same as in 
mu 
a vector of mean parameters for the NB variables (*Note: either 
p_zinb 
a vector of probabilities of structural zeros (not including zeros from the NB distribution) for the zeroinflated NB variables
(see 
rho 
the target correlation matrix which must be ordered
1st ordinal, 2nd continuous nonmixture, 3rd components of continuous mixtures, 4th regular Poisson, 5th zeroinflated Poisson,
6th regular NB, 7th zeroinflated NB; note that 
seed 
the seed value for random number generation (default = 1234) 
use.nearPD 
TRUE to convert 
quiet 
if FALSE prints messages, if TRUE suppresses message printing 
A list with components:
rho
the target correlation matrix, which will differ from the supplied matrix (if provided) if it was converted to
the nearest positivedefinite matrix
L_rho
the lower correlation bound
U_rho
the upper correlation bound
If continuous variables are desired, additional components are:
constants
the calculated constants
sixth_correction
a vector of the sixth cumulant correction values
valid.pdf
a vector with ith component equal to "TRUE" if variable Y_i has a valid power method PDF, else "FALSE"
If a target correlation matrix rho
is provided, each pairwise correlation is checked to see if it is within the lower and upper
bounds. If the correlation is outside the bounds, the indices of the variable pair are given.
valid.rho
TRUE if all entries of rho
are within the bounds, else FALSE
1) The most likely cause for function errors is that no solutions to fleish
or
poly
converged when using find_constants
. If this happens,
the function will stop. It may help to first use find_constants
for each continuous variable to
determine if a sixth cumulant correction value is needed. If the standardized cumulants are obtained from calc_theory
,
the user may need to use rounded values as inputs (i.e. skews = round(skews, 8)
). For example, in order to ensure that skew
is exactly 0 for symmetric distributions.
2) The kurtosis may be outside the region of possible values. There is an associated lower boundary for kurtosis associated
with a given skew (for Fleishman's method) or skew and fifth and sixth cumulants (for Headrick's method). Use
calc_lower_skurt
to determine the boundary for a given set of cumulants.
Please see references for SimCorrMix
.
find_constants
, corrvar
, validpar
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69  validcorr(n = 1000, k_cat = 1, k_cont = 1, method = "Polynomial",
means = 0, vars = 1, skews = 0, skurts = 0, fifths = 0, sixths = 0,
marginal = list(c(1/3, 2/3)), rho = matrix(c(1, 0.4, 0.4, 1), 2, 2),
quiet = TRUE)
## Not run:
# 2 continuous mixture, 1 binary, 1 zeroinflated Poisson, and
# 1 zeroinflated NB variable
n < 10000
seed < 1234
# Mixture variables: Normal mixture with 2 components;
# mixture of Logistic(0, 1), Chisq(4), Beta(4, 1.5)
# Find cumulants of components of 2nd mixture variable
L < calc_theory("Logistic", c(0, 1))
C < calc_theory("Chisq", 4)
B < calc_theory("Beta", c(4, 1.5))
skews < skurts < fifths < sixths < NULL
Six < list()
mix_pis < list(c(0.4, 0.6), c(0.3, 0.2, 0.5))
mix_mus < list(c(2, 2), c(L[1], C[1], B[1]))
mix_sigmas < list(c(1, 1), c(L[2], C[2], B[2]))
mix_skews < list(rep(0, 2), c(L[3], C[3], B[3]))
mix_skurts < list(rep(0, 2), c(L[4], C[4], B[4]))
mix_fifths < list(rep(0, 2), c(L[5], C[5], B[5]))
mix_sixths < list(rep(0, 2), c(L[6], C[6], B[6]))
mix_Six < list(list(NULL, NULL), list(1.75, NULL, 0.03))
Nstcum < calc_mixmoments(mix_pis[[1]], mix_mus[[1]], mix_sigmas[[1]],
mix_skews[[1]], mix_skurts[[1]], mix_fifths[[1]], mix_sixths[[1]])
Mstcum < calc_mixmoments(mix_pis[[2]], mix_mus[[2]], mix_sigmas[[2]],
mix_skews[[2]], mix_skurts[[2]], mix_fifths[[2]], mix_sixths[[2]])
means < c(Nstcum[1], Mstcum[1])
vars < c(Nstcum[2]^2, Mstcum[2]^2)
marginal < list(0.3)
support < list(c(0, 1))
lam < 0.5
p_zip < 0.1
size < 2
prob < 0.75
p_zinb < 0.2
k_cat < k_pois < k_nb < 1
k_cont < 0
k_mix < 2
Rey < matrix(0.39, 8, 8)
diag(Rey) < 1
rownames(Rey) < colnames(Rey) < c("O1", "M1_1", "M1_2", "M2_1", "M2_2",
"M2_3", "P1", "NB1")
# set correlation between components of the same mixture variable to 0
Rey["M1_1", "M1_2"] < Rey["M1_2", "M1_1"] < 0
Rey["M2_1", "M2_2"] < Rey["M2_2", "M2_1"] < Rey["M2_1", "M2_3"] < 0
Rey["M2_3", "M2_1"] < Rey["M2_2", "M2_3"] < Rey["M2_3", "M2_2"] < 0
# check parameter inputs
validpar(k_cat, k_cont, k_mix, k_pois, k_nb, "Polynomial", means,
vars, skews, skurts, fifths, sixths, Six, mix_pis, mix_mus, mix_sigmas,
mix_skews, mix_skurts, mix_fifths, mix_sixths, mix_Six, marginal, support,
lam, p_zip, size, prob, mu = NULL, rho = Rey)
# check to make sure Rey is within the feasible correlation boundaries
validcorr(n, k_cat, k_cont, k_mix, k_pois, k_nb, "Polynomial", means,
vars, skews, skurts, fifths, sixths, Six, mix_pis, mix_mus, mix_sigmas,
mix_skews, mix_skurts, mix_fifths, mix_sixths, mix_Six, marginal,
lam, p_zip, size, prob, mu = NULL, p_zinb, Rey, seed)
## End(Not run)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.