sdc | R Documentation |
Labeling, top and bottom coding, smoothing numeric data, and
removing different types of unique records defined by keys from synthetic data.
The function calls replicated.uniques to identify the rows
to be excluded from the synthetic data set(s).
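The idea of flagging records by unique key combinations can be sketched in base R. This is only an illustration on made-up data, not the code used by replicated.uniques; the data frames `orig` and `synth` and the helpers `key_string` and `uniq` are hypothetical names introduced here.

```r
# Toy stand-ins for the observed and the synthetic data (made-up values).
orig  <- data.frame(sex = c("F", "M", "M", "F"), age = c(30, 41, 41, 52))
synth <- data.frame(sex = c("F", "M", "F", "F"), age = c(30, 41, 52, 52))
keys  <- c("sex", "age")

# Hypothetical helper: collapse the key columns into one string per row.
key_string <- function(d, keys) do.call(paste, c(d[keys], sep = "\r"))

# Hypothetical helper: key values that occur exactly once.
uniq <- function(k) k[!(duplicated(k) | duplicated(k, fromLast = TRUE))]

k_orig <- key_string(orig, keys)
k_syn  <- key_string(synth, keys)

# Flag synthetic rows whose key combination is unique in the synthetic data
# and also unique in the original data.
flag <- k_syn %in% uniq(k_syn) & k_syn %in% uniq(k_orig)

# Keep only the rows that were not flagged.
synth_kept <- synth[!flag, ]
```

In this toy case only the first synthetic row is flagged, because its key combination occurs once in each data set.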
sdc(object, data, keys = NULL, prefix = NULL, suffix = NULL, label = NULL,
    rm.uniques.in.orig = FALSE, rm.replicated.uniques = FALSE,
    recode.vars = NULL, bottom.top.coding = NULL,
    recode.exclude = NULL, smooth.vars = NULL)
object |
an object of class |
data |
the original (observed) data set. |
keys |
Variables to be used as quasi-identifiers to check for unique
combinations. Passed to |
prefix |
A character string to be added as a prefix to all variable names in the synthetic data set(s) |
suffix |
A character string to be added as a suffix to all variable names in the synthetic data set(s) |
label |
a single string with a label to be added to the synthetic data sets as a new variable to make it clear that the data are synthetic/fake. |
rm.uniques.in.orig |
a logical value indicating whether unique replicates of key variables that are present in the original data set should be removed from the synthetic data set(s). |
rm.replicated.uniques |
a logical value indicating whether unique replicates of key variables that are also unique in the original data set should be removed. |
recode.vars |
a single string or a vector of strings with name(s) of variable(s) to be bottom- and/or top-coded. |
bottom.top.coding |
a list of two-element vectors specifying
bottom and top codes for each variable in |
recode.exclude |
a list specifying for each variable in
|
smooth.vars |
a single string or a vector of strings with name(s)
of numeric variable(s) to be smoothed ( |
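The interplay of recode.vars, bottom.top.coding and recode.exclude described above can be illustrated with a small base R sketch. It assumes that an NA bound means "no coding on that side" and that excluded values are left untouched, which is how the arguments read; `bottom_top_code` is a hypothetical helper written for this illustration and is not part of the package.

```r
# Hypothetical helper, not the package's implementation: clamp a numeric
# vector to [bottom, top]. An NA bound is skipped, and values listed in
# `exclude` (as well as missing values) are never recoded.
bottom_top_code <- function(x, bottom = NA, top = NA, exclude = NA) {
  eligible <- !is.na(x) & !(x %in% exclude)
  if (!is.na(bottom)) x[eligible & x < bottom] <- bottom
  if (!is.na(top))    x[eligible & x > top]    <- top
  x
}

# Made-up values: bottom code 20 and top code 80, nothing excluded.
age_coded <- bottom_top_code(c(16, 35, 92), bottom = 20, top = 80)

# Made-up values: top code 2000 only, with -8 treated as a special code.
income_coded <- bottom_top_code(c(-8, 500, 1500, 2500, NA, 4000),
                                bottom = NA, top = 2000,
                                exclude = c(NA, -8))
```

Here the special code -8 and the missing value pass through unchanged, while 2500 and 4000 are both replaced by 2000.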
An object provided as an argument, adjusted in accordance with the
other parameters' values.
replicated.uniques
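library(synthpop)  # provides syn(), sdc() and the SD2011 data set

# Synthesise the first 1000 rows of six variables, producing m = 2 data sets
ods <- SD2011[1:1000, c("sex", "age", "region", "edu", "marital", "income")]
s1 <- syn(ods, m = 2)

# Suffix variable names, add a label, remove uniques, and recode age and income
s1.sdc <- sdc(s1, ods, keys = c("sex", "age", "region"), suffix = "_synthetic",
              label = "false_data", rm.uniques.in.orig = TRUE,
              recode.vars = c("age", "income"),
              bottom.top.coding = list(c(20, 80), c(NA, 2000)),
              recode.exclude = list(NA, c(NA, -8)))
head(s1.sdc$syn[[2]])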