DataRebuild: IPD reconstruction from IPD summaries only.
In bonorico/gcipdr: Gaussian Copula (based) Individual Person Data (IPD) Reconstruction

View source: R/ipd_rec.R

'DataRebuild()' generates artificial data, i.e. stochastic copies of the original IPD, using only empirical distributional summaries of the IPD as input.

DataRebuild(H, n, correlation.matrix, moments, x.mode,
  johnson.parameters = NULL, stochastic.integration = FALSE,
  data.rearrange = c("incomplete", "norta"), corrtype = c("rank.corr",
  "moment.corr", "normal.corr"), marg.model = c("gamma", "johnson"),
  variable.names = NULL, SBjohn.correction = F, compute.eec = F,
  checkdata = F, tabulate.similar.data = FALSE,
  SI_k = 8000, input.sn.corr = NULL)

`H`	integer number of independent IPD replicates to be generated.
`n`	integer number of independent IPD records. Ex: number of rows (subjects) in original IPD.
`correlation.matrix`	pairwise IPD correlations values.
`moments`	numeric array of IPD marginal moments up to fourth degree for all IPD variables (columns).
`x.mode`	logical vector: TRUE if the IPD marginal variable is binary, FALSE otherwise.
`johnson.parameters`	array of Johnson parameters for each IPD marginal variable. Depends on the CRAN-archived 'JohnsonDistribution' package. If NULL, the parameters are computed from the given 'moments'.
`stochastic.integration`	logical: should Monte Carlo integration be used to resolve the Gaussian copula inversion (NORTA transformation)? Defaults to FALSE, i.e. numerical integration relying on package 'cubature' is used first.
`data.rearrange`	method of IPD dependence reconstruction, based either on all pairwise IPD correlations (norta) or on first degree correlations only (incomplete).
`corrtype`	type of IPD correlation matrix supplied as input: Spearman (rank.corr), Pearson (moment.corr), or van der Waerden (normal.corr). See Details.
`marg.model`	either "gamma" or "johnson" for modeling of non-binary IPD marginal. All binary marginals are modeled via a Bernoulli distribution, or a Beta distribution if Kruskal analytic conversion is used (see below).
`variable.names`	names of IPD marginal variables. If NULL (Default) automatic labels are generated.
`SBjohn.correction`	logical: should Johnson marginal values be corrected? Defaults to FALSE. If TRUE, wrongly sampled negative values are set to the minimum positive sampled value.
`compute.eec`	currently deprecated. Do not edit default value.
`checkdata`	logical: if TRUE it compares the IPD summary (marginal moments and pairwise correlations) averages over the H IPD reconstructions against the original IPD summary input values.
`tabulate.similar.data`	if TRUE and also checkdata = TRUE it returns the full tabular comparison between the reconstructed and original IPD summaries.
`SI_k`	resampling size of the stochastic integration approach. Defaults to 8000.
`NI_tol`	error tolerance for numerical integration. Default 1e-02; do not decrease it too much. A reasonable strictest tolerance is 1e-05.
`NI_maxEval`	max number of evaluations during numerical integration. Default 500 (a value of 0 instead implies an unlimited number of evaluations).
`input.sn.corr`	solution of 'correlation.matrix' into standard normal space (the Gaussian copula parameter, see Details). Default is NULL and solution is found internally. If matrix solution is instead given, it overrides internal computations and it is directly used to generate artificial data. This can be useful as a post hoc data generation tuning procedure. See Details.
`cp.finetune`	logical. If NORTA method is used and x.mode = TRUE, it iteratively fine-tunes Kruskal analytic solution (corrtype = rank.corr) of copula parameter, until the correlation bias of the generated artificial data is reduced. It can also be used along with argument 'assume.all.smooth = TRUE' (see below). Default FALSE.
`rescale.smoothed.binary`	if Kruskal analytic conversion was used and x.mode = T, it rescales smoothed binary variables into integer format (typically needed). Default FALSE.
`assume.all.smooth`	logical. If the NORTA method is used, it treats an input Pearson correlation matrix as if it were already a valid Kruskal solution, which falsely assumes all variables are continuous when some are actually discrete. This is biased, but it can yield quick results that are fine-tunable (see 'cp.finetune'). Default FALSE.
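
The summary inputs described above (marginal moments, pairwise correlations, binary indicators) can be illustrated with base R on a toy data set. This is only a sketch of the quantities involved: the data frame `ipd`, the helper `marginal_moments()`, and the reading of "moments up to fourth degree" as mean, standard deviation, skewness, and kurtosis are assumptions made for illustration. The layout 'DataRebuild()' actually accepts is the one defined by Return.key.IPD.summaries().

```r
# Illustrative only: object names and layout are hypothetical, not the
# package's required input format (see Return.key.IPD.summaries()).
set.seed(1)
ipd <- data.frame(
  age    = rgamma(200, shape = 20, rate = 0.4),  # toy continuous variable
  sbp    = rgamma(200, shape = 60, rate = 0.5),  # toy continuous variable
  smoker = rbinom(200, size = 1, prob = 0.3)     # toy binary variable
)

# Assumed meaning of "moments up to fourth degree":
# mean, standard deviation, skewness, kurtosis of each column
marginal_moments <- function(x) {
  m <- mean(x)
  s <- sd(x)
  z <- (x - m) / s
  c(mean = m, sd = s, skewness = mean(z^3), kurtosis = mean(z^4))
}

toy_moments   <- sapply(ipd, marginal_moments)   # one column per variable
toy_corr      <- cor(ipd, method = "spearman")   # would pair with corrtype = "rank.corr"
toy_is_binary <- sapply(ipd, function(x) all(x %in% c(0, 1)))
n_records     <- nrow(ipd)                       # number of IPD records
```

Only these summaries, not the rows of `ipd`, would then be passed on for reconstruction.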

'DataRebuild()' is based on a Gaussian copula inversion technique also known as the NORmal To Anything (NORTA) transformation. Inversion occurs upon conversion of an input empirical correlation matrix into standard normal space (the copula parameter solution). If data.rearrange = "norta", the conversion (optimization) expects a Pearson correlation matrix as input (corrtype = "moment.corr" is then chosen automatically by default). Using "norta" with "rank.corr" performs the Kruskal analytic conversion (theoretically valid if all marginals are continuous), whereas "normal.corr" simply returns the input matrix as is. If optimization fails with numerical integration (the default), try stochastic integration (stochastic.integration = TRUE) instead.
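
The general NORTA idea with a Kruskal-type conversion can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: it uses two continuous gamma marginals with made-up parameters, a made-up target Spearman correlation of 0.5, and the standard relation between Spearman and Pearson correlation under bivariate normality, r = 2 sin(pi * r_s / 6). All object names are hypothetical.

```r
set.seed(42)
n <- 5000

# Target Spearman correlation between two continuous marginals (toy value)
rank_corr <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)

# Analytic conversion: Spearman correlation -> Pearson correlation of the
# latent standard normals (the Gaussian copula parameter)
latent_corr <- 2 * sin(pi * rank_corr / 6)

# Step 1: draw correlated standard normals using the Cholesky factor
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(latent_corr)

# Step 2: map each column to uniforms with the standard normal CDF
u <- pnorm(z)

# Step 3: apply the inverse CDF of each target marginal (assumed gammas)
x1 <- qgamma(u[, 1], shape = 2, rate = 1)
x2 <- qgamma(u[, 2], shape = 5, rate = 0.5)

# Rank correlation is preserved by the monotone transformations above,
# so this should be close to the 0.5 target up to sampling error
achieved <- cor(x1, x2, method = "spearman")
```

With a Pearson input or with discrete marginals no such closed form applies, which is why the text above refers to an optimization step based on numerical or stochastic integration.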

An object of class 'similar.data'.

This program currently assumes that, before the input IPD summaries were calculated, every categorical IPD variable with m levels was converted to m-1 dummy (binary) variables. As a possible future alternative, categorical marginals could be supported directly and modeled via a Multinomial distribution. This program relies on the archived package 'JohnsonDistribution'.
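
The dummy coding assumed here can be obtained in base R before computing any summaries. The sketch below uses a made-up factor `treatment` with m = 3 levels; the variable and object names are hypothetical.

```r
# Toy categorical variable with m = 3 levels
treatment <- factor(c("A", "B", "C", "A", "C", "B"))

# model.matrix() with the default treatment contrasts returns an intercept
# plus m - 1 indicator columns; dropping the intercept leaves the dummies
dummies <- model.matrix(~ treatment)[, -1, drop = FALSE]
```

Each column of `dummies` is a 0/1 variable that would be flagged as binary (TRUE) in 'x.mode'.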

[Return.key.IPD.summaries()] for allowed input IPD summary format, [FitJohnsonDistribution()] from archived package JohnsonDistribution, [adaptIntegrate()] from package cubature

## Not run: 
DataRebuild( H = 100, n = 1000 )

## End(Not run)

help("gcipdr")

bonorico/gcipdr documentation built on May 2, 2021, 8:12 p.m.