A tool for producing synthetic versions of microdata containing confidential information, so that they can be safely released to users for exploratory analysis. The key objective of generating synthetic data is to replace sensitive original values with synthetic ones while causing minimal distortion of the statistical information contained in the data set. Variables, which can be categorical or continuous, are synthesised one by one using sequential modelling. Replacements are generated by drawing from conditional distributions fitted to the original data using parametric models or classification and regression tree (CART) models. Data are synthesised via the function syn(), which can run largely automatically with default settings or with methods defined by the user. Optional parameters can be used to influence the disclosure risk and the analytical quality of the synthesised data. For a description of the implemented method see Nowok, Raab and Dibben (2016) <http://doi.org/10.18637/jss.v074.i11>.
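The workflow described above can be sketched as follows. This is a minimal illustration rather than a recommended configuration: the choice of columns from the bundled SD2011 data, the seed value, and the object names (`ods`, `sds`, `sds.param`) are assumptions made for the example.

```r
library(synthpop)

# Observed data: a small subset of the bundled SD2011 survey
# (the column selection is an assumption made for this sketch).
vars <- c("sex", "age", "edu", "income")
ods  <- SD2011[, vars]

# Largely automated synthesis with default settings: variables are
# synthesised one by one, each conditional on those already synthesised.
sds <- syn(ods, seed = 2016)

# The same data synthesised with parametric methods chosen by variable type.
sds.param <- syn(ods, method = "parametric", seed = 2016)

# The synthetic data frame and the methods actually used.
head(sds$syn)
sds.param$method
```

With a single synthesis (the default `m = 1`), `sds$syn` is a data frame with the same columns as the observed data.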
Author: Beata Nowok, Gillian M Raab, Joshua Snoke and Chris Dibben
Date of publication: 2016-11-23 14:09:47
Maintainer: Beata Nowok <firstname.lastname@example.org>
License: GPL-2 | GPL-3
compare: Comparison of synthesised and observed data
compare.fit.synds: Compare model estimates based on synthesised and observed...
compare.synds: Compare univariate distributions of synthesised and observed...
glm.synds: Fitting (generalized) linear models to synthetic data
multi.compare: Multivariate comparison of synthesised and observed data
read.obs: Importing original data sets from external files
replicated.uniques: Replications in synthetic data
SD2011: Social Diagnosis 2011 - Objective and Subjective Quality of...
sdc: Tools for statistical disclosure control (sdc)
summary.fit.synds: Inference from synthetic data
summary.synds: Synthetic data object summaries
syn: Generating synthetic data sets
syn.bag: Synthesis with bagging
syn.cart: Synthesis with classification and regression trees (CART)
syn.lognorm: Synthesis by linear regression after transformation of a...
syn.logreg: Synthesis by logistic regression
syn.nested: Synthesis for a variable nested within another variable
syn.norm: Synthesis by linear regression
syn.normrank: Synthesis by normal linear regression preserving the marginal...
syn.passive: Passive synthesis
syn.pmm: Synthesis by predictive mean matching
syn.polr: Synthesis by ordered polytomous regression
syn.polyreg: Synthesis by unordered polytomous regression
syn.rf: Synthesis with random forest
syn.sample: Synthesis by simple random sampling
syn.survctree: Synthesis of survival time by classification and regression...
synthpop-package: Generating synthetic versions of sensitive microdata for...
tab.utility: [EXPERIMENTAL] Tabular utility
utility.synds: Distributional comparison of synthesised and observed data
write.syn: Exporting synthetic data sets to external files
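
Several of the functions listed above are typically used together after synthesis. The sketch below is one possible sequence, not a prescribed procedure: the variable selection, the model formula, and the object names are assumptions made for illustration, and `lm.synds()` is the linear-model companion documented alongside `glm.synds()`.

```r
library(synthpop)

# Observed and synthetic data (column selection is an assumption).
ods <- SD2011[, c("sex", "age", "edu", "income")]
sds <- syn(ods, seed = 2016)

# Compare the univariate distribution of one variable in the
# synthesised and observed data (dispatches to compare.synds).
compare(sds, ods, vars = "age")

# Fit a linear model to the synthetic data, then compare its estimates
# with those from the observed data (dispatches to compare.fit.synds).
fit <- lm.synds(income ~ sex + age, data = sds)
summary(fit)
compare(fit, ods)

# Statistical disclosure control: drop synthetic records that replicate
# unique records in the observed data.
sds.safe <- sdc(sds, ods, rm.replicated.uniques = TRUE)
```

The comparison calls produce plots, so in a non-interactive session the output goes to the default graphics device.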