create.fused | R Documentation |
Creates a synthetic data frame after the statistical matching of two data sources at the micro level.
create.fused(data.rec, data.don, mtc.ids,
z.vars, dup.x=FALSE, match.vars=NULL)
data.rec |
A matrix or data frame that plays the role of recipient in the statistical matching application. |
data.don |
A matrix or data frame that plays the role of donor in the statistical matching application. |
mtc.ids |
A matrix with two columns. Each row must contain the name or the index of the recipient record (row) in data.rec and the name or the index of the corresponding donor record (row) in data.don. |
z.vars |
A character vector with the names of the variables available only in data.don that are to be added to data.rec. |
dup.x |
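Logical. When TRUE, the values of the matching variables observed on the donor records are also added to the output; the matching variables must then be named in match.vars. The default is FALSE. |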
match.vars |
A character vector with the names of the matching variables. It has to be specified only when dup.x=TRUE. |
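As an illustration of the shape expected for mtc.ids, the following base R sketch builds such a matrix by hand and checks that every identifier refers to an existing row. The toy data frames, the row names and the column names rec.id and don.id are assumptions made for this example only; in practice the matrix would normally come from one of the hot deck functions listed further down.

```r
# Toy recipient and donor data frames (hypothetical data, for illustration only)
rec <- data.frame(x = c(1.0, 2.5, 3.0), y = c(10, 20, 30),
                  row.names = c("r1", "r2", "r3"))
don <- data.frame(x = c(1.1, 2.9), z = c(100, 300),
                  row.names = c("d1", "d2"))

# Two-column matrix: recipient row name first, matched donor row name second.
# The column names are an assumption chosen for readability.
mtc.ids <- cbind(rec.id = c("r1", "r2", "r3"),
                 don.id = c("d1", "d2", "d2"))

# Sanity check before fusing: every id must refer to an existing row
ids.ok <- all(mtc.ids[, 1] %in% rownames(rec)) &&
  all(mtc.ids[, 2] %in% rownames(don))
ids.ok
```

A donor may appear in more than one row (here d2 is matched to both r2 and r3), whereas each recipient appears once.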
This function creates the synthetic (or fused) data set once statistical matching has been applied in a micro framework. For details see D'Orazio et al. (2006).
The data frame data.rec with the z.vars filled in and, when dup.x=TRUE, with the values of the matching variables match.vars observed on the donor records.
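The fusion step described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: fuse_sketch is a hypothetical helper, the toy data are invented, and the ".don" suffix used to name the duplicated matching variables is an assumption of this sketch.

```r
# Toy data (hypothetical): rec lacks z, don lacks y, both share x
rec <- data.frame(x = c(1.0, 2.5, 3.0), y = c(10, 20, 30),
                  row.names = c("r1", "r2", "r3"))
don <- data.frame(x = c(1.1, 2.9), z = c(100, 300),
                  row.names = c("d1", "d2"))
mtc.ids <- cbind(c("r1", "r2", "r3"), c("d1", "d2", "d2"))

# Hypothetical helper mimicking the idea of the fusion step
fuse_sketch <- function(data.rec, data.don, mtc.ids, z.vars,
                        dup.x = FALSE, match.vars = NULL) {
  rec.rows <- mtc.ids[, 1]  # recipient row names
  don.rows <- mtc.ids[, 2]  # matched donor row names
  fused <- data.rec[rec.rows, , drop = FALSE]
  # copy each donor-only variable from the matched donor record
  for (v in z.vars) {
    fused[[v]] <- data.don[don.rows, v]
  }
  # optionally copy the donors' values of the matching variables;
  # the ".don" suffix is an assumption of this sketch
  if (dup.x) {
    for (v in match.vars) {
      fused[[paste0(v, ".don")]] <- data.don[don.rows, v]
    }
  }
  fused
}

fused.0 <- fuse_sketch(rec, don, mtc.ids, z.vars = "z")
fused.1 <- fuse_sketch(rec, don, mtc.ids, z.vars = "z",
                       dup.x = TRUE, match.vars = "x")
```

Keeping the donors' values of the matching variables next to the recipients' own values makes it easy to see how close each matched pair is.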
Marcello D'Orazio mdo.statmatch@gmail.com
D'Orazio, M., Di Zio, M. and Scanu, M. (2006). Statistical Matching: Theory and Practice. Wiley, Chichester.
NND.hotdeck
RANDwNND.hotdeck
rankNND.hotdeck
library(StatMatch) # provides NND.hotdeck() and create.fused()
lab <- c(1:15, 51:65, 101:115) # rows used as recipients
iris.rec <- iris[lab, c(1:3,5)] # recipient data.frame
iris.don <- iris[-lab, c(1:2,4:5)] # donor data.frame
# Now iris.rec and iris.don have the variables
# "Sepal.Length", "Sepal.Width" and "Species"
# in common.
# "Petal.Length" is available only in iris.rec
# "Petal.Width" is available only in iris.don
# find the closest donors using NND hot deck;
# distances are computed on "Sepal.Length" and "Sepal.Width"
out.NND <- NND.hotdeck(data.rec=iris.rec, data.don=iris.don,
                       match.vars=c("Sepal.Length", "Sepal.Width"),
                       don.class="Species")
# create synthetic data.set, without the
# duplication of the matching variables
fused.0 <- create.fused(data.rec=iris.rec, data.don=iris.don,
                        mtc.ids=out.NND$mtc.ids, z.vars="Petal.Width")
# create synthetic data.set, with the "duplication"
# of the matching variables
fused.1 <- create.fused(data.rec=iris.rec, data.don=iris.don,
                        mtc.ids=out.NND$mtc.ids, z.vars="Petal.Width",
                        dup.x=TRUE, match.vars=c("Sepal.Length", "Sepal.Width"))