rankSwap | R Documentation |
Swapping values within a range so that, first, the correlation structure of the original variables is preserved, and second, the values in each record are disturbed. To be used on numeric or ordinal variables where the rank can be determined and the correlation coefficient makes sense.
rankSwap(
obj,
variables = NULL,
TopPercent = 5,
BottomPercent = 5,
K0 = NULL,
R0 = NULL,
P = NULL,
missing = NA,
seed = NULL
)
obj |
a |
variables |
Names or indices of the variables to which rank swapping is
applied. For an object of class |
TopPercent |
Percentage of largest values that are grouped together before rank swapping is applied. |
BottomPercent |
Percentage of lowest values that are grouped together before rank swapping is applied. |
K0 |
Subset-mean preservation factor. Preserves the means before and
after rank swapping within a range based on K0. K0 is the subset-mean
preservation factor such that |
R0 |
Multivariate preservation factor. Preserves the correlation
between variables within a certain range based on the given constant R0. We
can specify the preservation factor as |
P |
Rank range as percentage of total sample size. We can specify the
rank range itself directly, noted as |
missing |
The value to be used as the missing value in the C++ routine instead of NA. If NA, a suitable value is calculated internally. Note that in the returned dataset, all NA values (if any) will be replaced with this value. |
seed |
Seed for the random number generator. |
Rank swapping sorts the values of one numeric variable by their numerical values (ranking) and swaps each value with another value within a restricted range. The restricted range is determined by the ranks of the two swapped values, which cannot differ, by definition, by more than P percent of the total number of observations. Only positive values of P, R0 and K0 are used, and only one of them should be supplied. If none is supplied, sdcMicro sets the parameter R0 to 0.95 internally.
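The restricted-range idea described above can be sketched in a few lines of base R. This is a simplified illustration for a single numeric vector under the rank range P only. It is not the C++ routine used by sdcMicro, and it ignores TopPercent, BottomPercent, K0 and R0. The function name `rank_swap_sketch` and its partner-selection rule (a uniformly chosen, not yet swapped value at most `window` ranks higher) are assumptions made for the example.

```r
# Simplified, illustrative rank swap for one numeric vector.
# `rank_swap_sketch` is a hypothetical helper, not part of sdcMicro.
rank_swap_sketch <- function(x, P = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- length(x)
  # Maximum allowed rank distance: P percent of the number of observations
  window <- max(1L, floor(P * n / 100))
  ord <- order(x)         # ord[r] is the position in x of the r-th smallest value
  sorted <- x[ord]        # values arranged by rank
  swapped <- logical(n)   # TRUE once a rank has taken part in a swap
  for (i in seq_len(n)) {
    if (swapped[i] || i == n) next
    # Candidate partners: unswapped ranks at most `window` above rank i
    candidates <- (i + 1L):min(n, i + window)
    candidates <- candidates[!swapped[candidates]]
    if (length(candidates) == 0L) next
    # Index-based sampling avoids sample()'s special case for a single number
    j <- candidates[sample.int(length(candidates), 1L)]
    tmp <- sorted[i]
    sorted[i] <- sorted[j]
    sorted[j] <- tmp
    swapped[c(i, j)] <- TRUE
  }
  out <- x
  out[ord] <- sorted      # write swapped values back to the original record positions
  out
}

x <- c(12, 45, 7, 33, 28, 51, 19, 40, 3, 60)
y <- rank_swap_sketch(x, P = 20, seed = 1)
```

In this sketch the set of values is unchanged (`sort(x)` equals `sort(y)`), so the marginal distribution is kept exactly, while each record receives a value whose original rank is at most `window` positions away from its own. A smaller P therefore disturbs the records less and keeps the relationship to other variables closer to the original.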
The rank-swapped data set or a modified sdcMicroObj-class
object.
Alexander Kowarik for the interface, Bernhard Meindl for improvements.
For the underlying C++ code: This work is being supported by the International Household Survey Network (IHSN) and funded by a DGF Grant provided by the World Bank to the PARIS21 Secretariat at the Organisation for Economic Co-operation and Development (OECD). This work builds on previous work which is elsewhere acknowledged.
Moore, Jr.R. (1996) Controlled data-swapping techniques for masking public use microdata, U.S. Bureau of the Census Statistical Research Division Report Series, RR 96-04.
Kowarik, A. and Templ, M. and Meindl, B. and Fonteneau, F. and Prantner, B.: Testing of IHSN Cpp Code and Inclusion of New Methods into sdcMicro, in: Lecture Notes in Computer Science, J. Domingo-Ferrer, I. Tinnirello (editors); Springer, Berlin, 2012, ISBN: 978-3-642-33626-3, pp. 63-77. doi:10.1007/978-3-642-33627-0_6
library(sdcMicro)

## for a data.frame: swap the selected numeric variables
data(testdata2)
data_swap <- rankSwap(
  obj = testdata2,
  variables = c("age", "income", "expend", "savings")
)

## for objects of class sdcMicro:
data(testdata2)
sdc <- createSdcObj(
  dat = testdata2,
  keyVars = c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex"),
  numVars = c("expend", "income", "savings"),
  w = "sampling_weight"
)
sdc <- rankSwap(sdc)