'DataRebuild()' generates artificial data, i.e. stochastic copies of the original IPD, using only empirical IPD distributional summaries as input.
DataRebuild(H, n, correlation.matrix, moments, x.mode,
  johnson.parameters = NULL, stochastic.integration = FALSE,
  data.rearrange = c("incomplete", "norta"), corrtype = c("rank.corr",
  "moment.corr", "normal.corr"), marg.model = c("gamma", "johnson"),
  variable.names = NULL, SBjohn.correction = FALSE, compute.eec = FALSE,
  checkdata = FALSE, tabulate.similar.data = FALSE,
  SI_k = 8000, input.sn.corr = NULL)
H |
integer number of independent IPD replicates to be generated. |
n |
integer number of independent IPD records, e.g. the number of rows (subjects) in the original IPD. |
correlation.matrix |
matrix of pairwise IPD correlation values. |
moments |
numeric array of IPD marginal moments up to fourth degree for all IPD variables (columns). |
x.mode |
logical vector: is each IPD marginal variable binary (TRUE) or not (FALSE)? |
johnson.parameters |
array of Johnson parameters for each IPD marginal variable. Depends on the CRAN-archived 'JohnsonDistribution' package. If NULL, the parameters are computed from the given 'moments'. |
stochastic.integration |
logical: should Monte Carlo integration be used to resolve the Gaussian copula inversion (NORTA transformation)? Defaults to FALSE, i.e. numerical integration relying on package 'cubature' is used first. |
data.rearrange |
method of IPD dependence reconstruction: based on all pairwise IPD correlations ("norta"), or on first degree correlations only ("incomplete"). |
corrtype |
type of IPD correlation matrix supplied: Spearman ("rank.corr"), Pearson ("moment.corr"), or Waerden ("normal.corr"). See Details. |
marg.model |
either "gamma" or "johnson", for modeling non-binary IPD marginals. All binary marginals are modeled via a Bernoulli distribution, or via a Beta distribution if the Kruskal analytic conversion is used (see below). |
variable.names |
names of the IPD marginal variables. If NULL (default), automatic labels are generated. |
SBjohn.correction |
logical: should Johnson marginal values be corrected? Defaults to FALSE. If TRUE, wrongly sampled negative values are set to the minimum positive sampled value. |
compute.eec |
currently deprecated. Do not edit default value. |
checkdata |
logical: if TRUE, the IPD summaries (marginal moments and pairwise correlations) averaged over the H IPD reconstructions are compared against the original IPD summary input values. |
tabulate.similar.data |
logical: if TRUE and checkdata = TRUE, the full tabular comparison between the reconstructed and original IPD summaries is returned. |
SI_k |
resampling size for the stochastic integration approach. Defaults to 8000. |
NI_tol |
error tolerance for numerical integration. Default 1e-02; do not decrease it too much. As the strictest reasonable value, use 1e-05. |
NI_maxEval |
maximum number of evaluations during numerical integration. Default 500 (0 instead implies an unlimited number of evaluations). |
input.sn.corr |
solution of 'correlation.matrix' in standard normal space (the Gaussian copula parameter, see Details). Default is NULL, in which case the solution is found internally. If a matrix solution is given instead, it overrides the internal computations and is used directly to generate artificial data. This can be useful as a post hoc tuning procedure for data generation. |
cp.finetune |
logical. If NORTA method is used and x.mode = TRUE, it iteratively fine-tunes Kruskal analytic solution (corrtype = rank.corr) of copula parameter, until the correlation bias of the generated artificial data is reduced. It can also be used along with argument 'assume.all.smooth = TRUE' (see below). Default FALSE. |
rescale.smoothed.binary |
if Kruskal analytic conversion was used and x.mode = T, it rescales smoothed binary variables into integer format (typically needed). Default FALSE. |
assume.all.smooth |
logical. If the NORTA method is used, it treats an input Pearson correlation matrix as if it were already a valid Kruskal solution, which falsely assumes that all variables are continuous when some are actually discrete. This is biased, but it can yield a quick solution (which can be fine-tuned, see 'cp.finetune'). Default FALSE. |
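As a rough illustration of the three matrix types accepted by 'corrtype', the sketch below computes each of them from a toy data frame with base R only. The data, the variable names and the reading of "normal.corr" as a normal-scores correlation (Pearson correlation of `qnorm(rank / (n + 1))`) are assumptions made for this example; the format actually expected by 'DataRebuild()' is the one produced by 'Return.key.IPD.summaries()'.

```r
# Toy IPD with two continuous variables (hypothetical data, illustration only)
set.seed(1)
n  <- 200
x1 <- rgamma(n, shape = 2, rate = 1)
x2 <- x1 + rnorm(n)
ipd <- data.frame(x1 = x1, x2 = x2)

# Pearson (moment) correlation, cf. corrtype = "moment.corr"
moment_corr <- cor(ipd, method = "pearson")

# Spearman (rank) correlation, cf. corrtype = "rank.corr"
rank_corr <- cor(ipd, method = "spearman")

# Normal-scores correlation (assumed meaning of "normal.corr"):
# Pearson correlation of the ranks mapped through the standard normal quantile function
normal_scores <- as.data.frame(
  lapply(ipd, function(v) qnorm(rank(v) / (length(v) + 1)))
)
normal_corr <- cor(normal_scores)
```

All three results are symmetric matrices with a unit diagonal; they generally differ off the diagonal when the marginals are skewed, which is why the type has to be declared.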
'DataRebuild()' is based on a Gaussian copula inversion technique, also known as the NORmal To Anything (NORTA) transformation. Inversion occurs upon conversion of an input empirical correlation matrix into standard normal space (the copula parameter solution). If data.rearrange = "norta", the conversion (an optimization) expects a Pearson correlation matrix as input (corrtype = "moment.corr" is chosen automatically by default). Using "norta" with "rank.corr" performs the Kruskal analytic conversion (theoretically valid if all marginals are continuous), whereas "normal.corr" simply returns the input matrix as it is. If optimization fails with numerical integration (the default), try stochastic integration (stochastic.integration = TRUE) instead.
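The NORTA idea can be sketched in a few lines of base R. This is not the package's internal code: it is a minimal two-variable illustration that assumes continuous gamma marginals with made-up parameters and a made-up target Spearman correlation, and it uses the analytic rank-to-normal conversion for a Gaussian copula, rho_z = 2 * sin(pi * rho_s / 6).

```r
set.seed(42)
n <- 5000

# Assumed target Spearman correlation between two continuous marginals
rho_s <- 0.5

# Analytic conversion of the rank correlation into standard normal space
# (the Gaussian copula parameter); valid for continuous marginals
rho_z <- 2 * sin(pi * rho_s / 6)
Sigma <- matrix(c(1, rho_z, rho_z, 1), nrow = 2)

# Step 1: correlated standard normals via the Cholesky factor of Sigma
Z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(Sigma)

# Step 2: map each normal margin to a uniform on (0, 1)
U <- pnorm(Z)

# Step 3: apply the inverse CDF of each target marginal (hypothetical gammas)
x1 <- qgamma(U[, 1], shape = 2, rate = 0.5)
x2 <- qgamma(U[, 2], shape = 5, rate = 1)

# The sample Spearman correlation should be close to the target rho_s
rho_s_hat <- cor(x1, x2, method = "spearman")
```

Because Spearman correlation is invariant under the monotone transformations in steps 2 and 3, the analytic conversion is exact in this continuous case. With a Pearson input matrix, or with binary marginals, the copula parameter has no such closed form, which is where the numerical or stochastic integration described above comes in.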
An object of class 'similar.data'.
This program currently assumes that, before the input IPD summaries were calculated, every IPD categorical variable with m levels was converted to m-1 dummy (binary) variables. As an alternative, a future version could allow categorical marginals as well, modeled via a Multinomial distribution. This program relies on the archived package 'JohnsonDistribution'.
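One possible way to do this dummy conversion before computing the IPD summaries is base R's `model.matrix()`. The data frame and the column names below are hypothetical, and the resulting `x_mode` vector is only meant to illustrate the binary/non-binary flag described under 'x.mode'.

```r
# Hypothetical IPD with one continuous and one 3-level categorical variable
ipd <- data.frame(
  age   = c(54, 61, 47, 70, 58, 66),
  stage = factor(c("I", "II", "III", "I", "III", "II"))
)

# With default treatment contrasts, model.matrix() yields an intercept plus
# m - 1 dummy columns for an m-level factor; drop the intercept column
dummies <- model.matrix(~ stage, data = ipd)[, -1, drop = FALSE]

# Numeric IPD with the categorical variable replaced by its dummies
ipd_numeric <- cbind(ipd["age"], dummies)

# Flag which columns are binary (TRUE) and which are not (FALSE)
x_mode <- sapply(ipd_numeric, function(v) all(v %in% c(0, 1)))
```

The first factor level ("I" here) acts as the reference category and is represented by zeros in both dummy columns.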
[Return.key.IPD.summaries()] for allowed input IPD summary format, [FitJohnsonDistribution()] from archived package JohnsonDistribution, [adaptIntegrate()] from package cubature
## Not run:
DataRebuild(H = 100, n = 1000)
## End(Not run)
help("gcipdr")