Produces tables from observed and synthesised data and calculates utility measures that compare the tables with what would be expected if the synthesising model were correct.

```
utility.tab(object, data, vars = NULL, ngroups = 5, useNA = TRUE,
            print.tables = length(vars) < 4, print.stats = 'VW',
            print.zdiff = FALSE, digits = 2, ...)

## S3 method for class 'utility.tab'
print(x, print.tables = x$print.tables,
      print.zdiff = x$print.zdiff, print.stats = x$print.stats,
      digits = x$digits, ...)
```

`object` |
an object of class |

`data` |
the original (observed) data set. |

`vars` |
a single string or a vector of strings with the names of variables to be used to form the table. |

`ngroups` |
if numerical (non-factor) variables are included they will be
classified into this number of groups to form tables. Classification is
performed using |

`useNA` |
determines if NA values are to be included in tables. |

`print.tables` |
a logical value that determines if tables of observed and synthesised data are to be printed. |

`print.stats` |
determines which chi-squared statistics are printed to compare the observed and synthesised tables: 'VW' for Voas-Williamson, 'FT' for Freeman-Tukey, or c('VW','FT') for both. |

`print.zdiff` |
a logical value that determines if tables of Z scores for differences between observed and expected are to be printed. |

`digits` |
an integer indicating the number of decimal places
for printing statistics, |

`...` |
additional parameters; these can be passed on to the `classIntervals()` function. |

`x` |
an object of class |
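
The effect of `ngroups` and `useNA` can be illustrated with a small base R sketch. It assumes quantile-based break points, which may differ from the classification the package actually applies, and the toy variables `age` and `sex` are hypothetical.

```r
# Toy data: one numeric and one factor variable (hypothetical values).
set.seed(1)
age <- c(sample(18:90, 48, replace = TRUE), NA, NA)
sex <- factor(sample(c("F", "M"), 50, replace = TRUE))

ngroups <- 5

# Assumed grouping rule: quantile-based break points.
# unique() guards against tied quantiles producing duplicate breaks.
breaks <- unique(quantile(age,
                          probs = seq(0, 1, length.out = ngroups + 1),
                          na.rm = TRUE))
age_grp <- cut(age, breaks = breaks, include.lowest = TRUE)

# Cross-tabulate, keeping NA as its own category (compare useNA = TRUE).
tab <- table(age_grp, sex, useNA = "ifany")
print(tab)
```

The two missing ages appear as a separate `NA` row rather than being dropped, so the table total still equals the number of records.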

Forms tables of observed and synthesised values for the variables
specified in `vars`. Two utility measures are calculated from the cells
of the tables: a measure of fit proposed by Voas and Williamson,
`sum((observed - synthesised)^2 / ((observed + synthesised)/2))`,
and one proposed by Freeman and Tukey,
`4 * sum((observed^0.5 - synthesised^0.5)^2)`.
In both cases, cells where observed and synthesised are both zero do not
contribute to the sum. If the synthesising model is correct, both of these
measures should have chi-square distributions for large samples.
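
As a rough illustration of the two formulas, the following base R sketch computes both measures by hand from a pair of toy count vectors. The counts are hypothetical, the two tables are assumed to have equal totals, and the degrees of freedom are assumed to be the number of non-empty cells minus one; this is not the package's own implementation.

```r
# Toy cell counts for the same table in the observed and the synthesised
# data (hypothetical numbers with equal totals).
observed    <- c(30, 20, 0, 50, 0)
synthesised <- c(25, 25, 5, 45, 0)

# Cells that are empty in both tables contribute nothing, so drop them.
keep       <- (observed + synthesised) > 0
obs        <- observed[keep]
syn_counts <- synthesised[keep]

# Voas-Williamson measure of fit.
VW <- sum((obs - syn_counts)^2 / ((obs + syn_counts) / 2))

# Freeman-Tukey measure of fit.
FT <- 4 * sum((sqrt(obs) - sqrt(syn_counts))^2)

# Assumed degrees of freedom: non-empty cells minus one.
df <- sum(keep) - 1

# Upper-tail chi-square p-values and ratios of each statistic to df.
pvalVW <- pchisq(VW, df, lower.tail = FALSE)
pvalFT <- pchisq(FT, df, lower.tail = FALSE)
print(c(VW = VW, FT = FT, df = df, ratioVW = VW / df, ratioFT = FT / df))
```

A ratio close to 1 is what a chi-square variable with `df` degrees of freedom would give on average, so ratios well above 1 point to a poorer fit than the synthesising model would predict.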

An object of class `utility.tab`, which is a list with the following
components:

`m` |
number of synthetic data sets in object, i.e. |

`tab.obs` |
a table from the observed data. |

`UtabFT` |
a vector with |

`UtabVW` |
a vector with |

`df` |
a vector of degrees of freedom for the chi-square tests, equal to the number of cells in the table with any observed or synthesised counts, minus one. |

`ratioFT` |
a vector with ratios of |

`ratioVW` |
a vector with ratios of |

`pvalFT` |
a vector with |

`pvalVW` |
a vector with |

`nempty` |
a vector of length |

`tab.syn` |
a table or a list of |

`tab.zdiff` |
a table or a list of |

`n` |
number of observations in the original data set. |

Nowok, B., Raab, G.M. and Dibben, C. (2016). synthpop: Bespoke
creation of synthetic data in R. *Journal of Statistical Software*,
**74**(11), 1-26. doi: 10.18637/jss.v074.i11.

Read, T.R.C. and Cressie, N.A.C. (1988). *Goodness-of-Fit Statistics for
Discrete Multivariate Data*. Springer-Verlag, New York.

Voas, D. and Williamson, P. (2001). Evaluating goodness-of-fit measures for
synthetic microdata. *Geographical and Environmental Modelling*,
**5**(2), 177-200.

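```
library(synthpop)
# Four variables from the first 1000 rows of the SD2011 data set
ods <- SD2011[1:1000, c("sex", "age", "edu", "marital")]
# Ten synthetic data sets; tabulate two factor variables
s1 <- syn(ods, m = 10)
utility.tab(s1, ods, vars = c("marital", "sex"))
# One synthetic data set; numeric age is grouped into 3 classes
s2 <- syn(ods, m = 1)
utility.tab(s2, ods, vars = c("marital", "age"), ngroups = 3, print.tables = TRUE)
# style = "pretty" is an additional parameter supplied through ...
u2 <- utility.tab(s2, ods, vars = c("marital", "age"), style = "pretty")
print(u2, print.tables = TRUE, print.zdiff = TRUE)
```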

