compare.fit.synds  R Documentation 
The same model that was used for the synthesised data set is fitted to the
observed data set. The coefficients with confidence intervals for the
observed data are plotted together with their estimates from synthetic data.
When more than one synthetic data set has been generated (object$m > 1),
combining rules are applied. Analysis-specific utility measures are used to
evaluate differences between synthetic and observed data.
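As a rough illustration of the combining step, the base R sketch below averages coefficient estimates over m = 3 synthetic data sets. The matrix est, its coefficient names and its values are made up for illustration. Only point estimates are combined here; how compare() combines variances is not shown and nothing is assumed about it.

```r
# Hypothetical coefficient estimates from m = 3 synthetic data sets
# (rows = synthetic data sets, columns = coefficients); values are made up.
est <- matrix(c(0.50, -0.020,
                0.56, -0.018,
                0.47, -0.025),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("sexFEMALE", "age")))

# Combined point estimate: the mean over the m synthetic data sets.
combined <- colMeans(est)
print(combined)
```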
## S3 method for class 'fit.synds'
compare(object, data, plot = "Z", print.coef = FALSE,
  return.plot = TRUE, plot.intercept = FALSE, lwd = 1, lty = 1,
  lcol = c("#1A3C5A","#4187BF"), dodge.height = .5, point.size = 2.5,
  population.inference = FALSE, ci.level = 0.95, ...)

## S3 method for class 'compare.fit.synds'
print(x, print.coef = x$print.coef, ...)
object 
an object of type fit.synds. 
data 
an original observed data set. 
plot 
values to be plotted, e.g. "Z" (the default) or "coef". 
print.coef 
a logical value determining whether tables of estimates for the original and synthetic data should be printed. 
return.plot 
a logical value indicating whether a confidence interval plot should be returned. 
plot.intercept 
a logical value indicating whether estimates for intercept should be plotted. 
lwd 
the line width. 
lty 
the line type. 
lcol 
line colours. 
dodge.height 
size of vertical shifts for confidence intervals to prevent overlapping. 
point.size 
size of plotting symbols used to plot point estimates of coefficients. 
population.inference 
a logical value indicating whether intervals for inference to population quantities, as described by Karr et al. (2006), should be calculated and plotted. This option suppresses the lack-of-fit test and the standardised differences since these are based on differences standardised by the original interval widths. 
ci.level 
Confidence interval coverage as a proportion. 
... 
additional parameters passed to 
x 
an object of class compare.fit.synds. 
This function can be used to evaluate whether the method used for
synthesis is appropriate for the fitted model. If this is the case, the
estimates from the synthetic data of what would be expected from the original
data, xpct(Beta) and xpct(Z), should not differ from the estimates from
the observed data (Beta and Z) by more than would be expected from
the standard errors (se(Beta) and se(Z)). For more details see the
vignette on inference.
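The comparison described above can be sketched in base R for a single coefficient. All numbers are made up for illustration. The variable names are hypothetical, and the two-sided form of the p-value is an assumption, not something taken from the package source.

```r
# Hypothetical values for a single coefficient (made up for illustration).
beta_obs <- 0.40   # Beta: estimate from the observed data
se_obs   <- 0.10   # se(Beta): its standard error
beta_syn <- 0.55   # xpct(Beta): combined estimate from the synthetic data

# Difference standardised by the standard error from the observed data.
std_diff <- (beta_syn - beta_obs) / se_obs

# Two-sided p-value from a standard Normal distribution (assumed form).
p_value <- 2 * pnorm(-abs(std_diff))

print(c(std.diff = std_diff, p.value = p_value))
```

A small p-value here would suggest that the synthetic estimate differs from the observed one by more than the standard error alone would explain.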
An object of class compare.fit.synds, which is a list with the
following components:
call 
the original call to fit the model to the synthesised data set. 
coef.obs 
a data frame including estimates based on the observed
data: coefficients ( 
coef.syn 
a data frame including (combined) estimates based on
the synthesised data: point estimates of observed data coefficients
( 
coef.diff 
a data frame containing standardized differences between the coefficients estimated from the original data and those calculated from the combined synthetic data. The difference is standardized by dividing by the estimated standard error of the fit from the original. The corresponding p-values are calculated from a standard Normal distribution and represent the probability of achieving differences as large as those found if the model used for synthesis is compatible with the model that generated the original data. 
mean.abs.std.diff 
Mean absolute standardized difference (over all coefficients). 
ci.overlap 
a data frame containing the percentage of overlap between
the estimated synthetic confidence intervals and the original sample
confidence intervals for each parameter. When 
mean.ci.overlap 
Mean confidence interval overlap (over all coefficients). 
lack.of.fit 
lack-of-fit measure from all 
lof.pvalue 
p-value for the combined lack-of-fit test of the null hypothesis that the method used for synthesis retains all relationships between variables that influence the parameters of the fit. 
ci.plot 

print.coef 
a logical value determining whether tables of estimates for the original and synthetic data should be printed. 
m 
the number of synthetic versions of the original (observed) data. 
ncoef 
the number of coefficients in the fitted model (including an intercept). 
incomplete 
whether methods for incomplete synthesis due to Reiter (2003) have been used in calculations. 
population.inference 
whether intervals as described by Karr et al. (2006) have been calculated. 
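To give a feel for the kind of quantity reported in ci.overlap, the base R sketch below computes one possible overlap measure for two intervals: the length of their intersection, averaged relative to each interval's width. The function ci_overlap() is hypothetical and this definition is an assumption made for illustration; it is not taken from the package source.

```r
# Overlap between two confidence intervals, averaged over both widths.
# Assumed definition for illustration only; ci_overlap() is a hypothetical helper.
ci_overlap <- function(lo_obs, up_obs, lo_syn, up_syn) {
  # Length of the intersection (negative if the intervals are disjoint).
  inter <- pmin(up_obs, up_syn) - pmax(lo_obs, lo_syn)
  0.5 * (inter / (up_obs - lo_obs) + inter / (up_syn - lo_syn))
}

# Made-up intervals: observed (0.2, 0.6), synthetic (0.3, 0.7).
overlap <- ci_overlap(0.2, 0.6, 0.3, 0.7)
print(overlap)
```

The result is a proportion; multiplying by 100 gives a percentage. Under this definition identical intervals give 1 and disjoint intervals give a negative value.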
Karr, A., Kohnen, C.N., Oganian, A., Reiter, J.P. and Sanil, A.P. (2006). A framework for evaluating the utility of data altered to protect confidentiality. The American Statistician, 60(3), 224-232.
Nowok, B., Raab, G.M. and Dibben, C. (2016). synthpop: Bespoke creation of synthetic data in R. Journal of Statistical Software, 74(11), 1-26. doi: 10.18637/jss.v074.i11.
Reiter, J.P. (2003). Inference for partially synthetic, public use microdata sets. Survey Methodology, 29, 181-188.
summary.fit.synds
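library(synthpop)  # provides SD2011, syn(), glm.synds() and compare()

# Observed data: four variables from the SD2011 data set
ods <- SD2011[, c("sex", "age", "edu", "smoke")]

# Generate m = 3 synthetic data sets and fit the same model to each
s1 <- syn(ods, m = 3)
f1 <- glm.synds(smoke ~ sex + age + edu, data = s1, family = "binomial")

# Compare the synthetic fits with the fit to the observed data
compare(f1, ods)
compare(f1, ods, print.coef = TRUE, plot = "coef")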