utility.tables  R Documentation 
Calculates and plots tables of utility measures. The calculations of
utility measures are done by the function utility.tab.
Options are all one-way tables, all two-way tables or three-way tables
for a specified third variable along with pairs of all other variables.
This function can also be used with synthetic data NOT created by
syn(), but then the additional parameters not.synthesised and cont.na
might need to be provided.
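As a rough illustration of the data.frame method, the sketch below builds a stand-in "synthetic" data set by hand and then supplies the two extra parameters. The permutation step is a hypothetical placeholder for an external synthesiser, and the missing-value code -8 for income is an assumption about how SD2011 is coded; neither comes from this help page.

```r
library(synthpop)

# Observed data: a subset of the SD2011 survey shipped with synthpop
ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "region", "income")]

# Hypothetical stand-in for an external synthesiser: keep 'sex' exactly as
# observed and permute every other column independently of the rest
set.seed(1)
ext <- ods
for (v in setdiff(names(ods), "sex")) {
  ext[[v]] <- sample(ods[[v]])
}

# 'ext' is a plain data.frame, so information that a 'synds' object would
# normally carry is passed by hand
u_ext <- utility.tables(ext, ods,
                        cont.na = list(income = -8),  # assumed missing-value code
                        not.synthesised = "sex",      # left unchanged above
                        tables = "twoway",
                        plot = FALSE)
print(u_ext, plot = FALSE)
```

Because the columns are shuffled independently, pairs of related variables would be expected to show poor two-way utility here, which makes this a convenient way to see what a bad result looks like.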
## S3 method for class 'synds'
utility.tables(object, data, tables = "twoway", maxtables = 5e4,
  vars = NULL, third.var = NULL, useNA = TRUE, ngroups = 5,
  tab.stats = c("pMSE", "S_pMSE", "df"), plot.stat = "S_pMSE",
  plot = TRUE, print.tabs = FALSE, digits.tabs = 4, max.scale = NULL,
  min.scale = 0, plot.title = NULL, nworst = 5, ntabstoprint = 0,
  k.syn = FALSE, low = "grey92", high = "#E41A1C", n.breaks = NULL,
  breaks = NULL, ...)

## S3 method for class 'data.frame'
utility.tables(object, data, cont.na = NULL, not.synthesised = NULL,
  tables = "twoway", maxtables = 5e4, vars = NULL, third.var = NULL,
  useNA = TRUE, ngroups = 5, tab.stats = c("pMSE", "S_pMSE", "df"),
  plot.stat = "S_pMSE", plot = TRUE, print.tabs = FALSE,
  digits.tabs = 4, max.scale = NULL, min.scale = 0, plot.title = NULL,
  nworst = 5, ntabstoprint = 0, k.syn = FALSE, low = "grey92",
  high = "#E41A1C", n.breaks = NULL, breaks = NULL, ...)

## S3 method for class 'list'
utility.tables(object, data, cont.na = NULL, not.synthesised = NULL,
  tables = "twoway", maxtables = 5e4, vars = NULL, third.var = NULL,
  useNA = TRUE, ngroups = 5, tab.stats = c("pMSE", "S_pMSE", "df"),
  plot.stat = "S_pMSE", plot = TRUE, print.tabs = FALSE,
  digits.tabs = 4, max.scale = NULL, min.scale = 0, plot.title = NULL,
  nworst = 5, ntabstoprint = 0, k.syn = FALSE, low = "grey92",
  high = "#E41A1C", n.breaks = NULL, breaks = NULL, ...)

## S3 method for class 'utility.tables'
print(x, print.tabs = NULL, digits.tabs = NULL, plot = NULL,
  plot.title = NULL, max.scale = NULL, min.scale = NULL, nworst = NULL,
  ntabstoprint = NULL, ...)
object 
an object of class 
data 
the original (observed) data set. 
cont.na 
a named list of codes for missing values for continuous
variables if different from the 
not.synthesised 
a vector of variable names for any variables that have been left unchanged in the synthetic data. 
tables 
defines the type of tables to produce. Options are

maxtables 
maximum number of tables that will be produced. If the number of
tables is larger, then utility is only measured for a sample of size

.
vars 
a vector of strings with the names of variables to be used to form the table, or a vector of variable numbers in the original data. Defaults to all variables in both original and synthetic data. 
third.var 
when 
useNA 
determines if 
ngroups 
if numerical (non-factor) variables included with

tab.stats 
statistics to include in the table of results. Must be
a selection from: 
plot.stat 
statistics to plot. Choice is 
plot 
determines if plot will be produced when the result is printed. 
print.tabs 
logical value that determines if table of results is to be printed. 
digits.tabs 
number of digits to print for the table, except for p-values, which are always printed to 4 places. 
max.scale 
a numeric value for the maximum value used in calculating
the shading of the plots. If it is 
min.scale 
a numeric value for the minimum value used in calculating
the shading of the plots. If it is 
plot.title 
title for the plot. 
nworst 
the number of variable combinations with the worst utility scores to be printed. 
ntabstoprint 
the number of tables with the worst utility to print for observed and synthetic data. 
k.syn 
a logical indicator as to whether the sample size itself has been synthesised. 
low 
colour for low end of the gradient. 
high 
colour for high end of the gradient. 
n.breaks 
the number of break points to create if breaks are not given directly. 
breaks 
breaks for a two colour binned gradient. 
... 
additional parameters 
x 
an object of class 
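The print method accepts several of the same display arguments as utility.tables(), so one computed result can be shown in different ways without recalculating. The sketch below assumes that values passed to print() are used in place of the settings stored in the object; the seed and the print.flag argument of syn() are only there to make the run quiet and repeatable.

```r
library(synthpop)

ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "region", "income")]
s1 <- syn(ods, seed = 2024, print.flag = FALSE)

# Compute two-way utility once; nothing is printed or plotted on assignment
u <- utility.tables(s1, ods, tables = "twoway", plot = FALSE)

# First view: full table of results, rounded, three worst combinations
print(u, print.tabs = TRUE, digits.tabs = 2, nworst = 3, plot = FALSE)

# Second view of the same object: also show the observed and synthetic
# cross-tabulation for the single worst table
print(u, ntabstoprint = 1, plot = FALSE)
```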
Calculates tables of observed and synthesised values for the variables
specified in vars with the function utility.tab and produces tables and
plots of one-way, two-way or three-way utility measures formed from
vars. Several options for utility measures can be selected for printing
or plotting. Details are in the help file for utility.tab.
The tables and variables with the worst utility scores are identified. Visualisations of the matrices of utility scores are plotted. For three-way tables, a third variable can be defined to select all tables involving that variable for plotting. If it is not specified, the variable with tables giving the worst utility is selected as the third variable.
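The options described here can be combined in a single session. The following sketch restricts one-way tables to a few variables and then requests three-way tables without naming a third variable, so that the function chooses one itself. The choice of variables, the value of ngroups and the seed are arbitrary illustrations.

```r
library(synthpop)

ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "region", "income")]
s1 <- syn(ods, seed = 2024, print.flag = FALSE)

# One-way utility for three variables only; numeric variables are grouped
# into 4 categories before tabulation
u1 <- utility.tables(s1, ods, tables = "oneway",
                     vars = c("age", "income", "edu"),
                     ngroups = 4, plot = FALSE)
print(u1, print.tabs = TRUE, plot = FALSE)

# Three-way utility with third.var left as NULL; per the description above,
# the variable whose tables give the worst utility is then selected
u3 <- utility.tables(s1, ods, tables = "threeway", plot = FALSE)
print(u3, plot = FALSE)
```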
An object of class utility.tables, which is a list with the following
components:
tabs 
a table with all the selected measures for all combinations of
variables defined by 
plot.stat 
measure used in 
tables 
see above. 
third.var 
see above. 
utility.plot 
plot of the selected utility measure. 
var.scores 
an average of utility scores for all combinations with other variables. 
plot 
see above. 
print.tabs 
see above. 
digits.tabs 
see above. 
plot.title 
see above. 
max.scale 
see above. 
min.scale 
see above. 
ntabstoprint 
see above. 
nworst 
see above. 
worstn 
variable combinations with 
worsttabs 
observed and synthetic crosstabulations for 
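Since the returned value is a list, its components can be inspected directly rather than only through print(). A minimal sketch, using component names from the list above (the seed is arbitrary):

```r
library(synthpop)

ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "region", "income")]
s1 <- syn(ods, seed = 2024, print.flag = FALSE)
u <- utility.tables(s1, ods, tables = "twoway", plot = FALSE)

# Components available in the result
names(u)

# Utility measures for each combination of variables
head(u$tabs)

# Average utility score per variable across its combinations
u$var.scores

# Variable combinations with the worst utility scores
u$worstn
```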
Read, T.R.C. and Cressie, N.A.C. (1988) Goodness-of-Fit Statistics for Discrete Multivariate Data, Springer-Verlag, New York.
Voas, D. and Williamson, P. (2001) Evaluating goodness-of-fit measures for synthetic microdata. Geographical and Environmental Modelling, 5(2), 177-200.
utility.tab
library(synthpop)  # provides SD2011, syn() and utility.tables()

ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "region", "income")]
s1 <- syn(ods)

### synthetic data provided as a 'synds' object
(t1 <- utility.tables(s1, ods, tab.stats = "all", print.tabs = TRUE))

### synthetic data provided as a 'data.frame' object
(t1 <- utility.tables(s1$syn, ods, tab.stats = "all", print.tabs = TRUE))

### two-way tables, printed with a fixed upper limit for the plot shading
t2 <- utility.tables(s1, ods, tables = "twoway")
print(t2, max.scale = 3)

### three-way tables involving 'sex', with and without missing values
(t3 <- utility.tables(s1, ods, tab.stats = "all", tables = "threeway",
                      third.var = "sex", print.tabs = TRUE))
(t4 <- utility.tables(s1, ods, tab.stats = "all", tables = "threeway",
                      third.var = "sex", useNA = FALSE, print.tabs = TRUE))

(t5 <- utility.tables(s1, ods, tab.stats = "all", print.tabs = TRUE))