Uses the sample-moment-based density approximant method to estimate the density function of a univariate distribution based on noise-multiplied data.
unmask(maskedVectorToBeUnmasked, noisefile)
maskedVectorToBeUnmasked | Masked data, as generated by the R function mask.
noisefile | Noise file containing a sample of the noise used to mask maskedVectorToBeUnmasked, as written by the R function mask.
unmask is fully described in Lin and Fielding (2015); the supporting theory can be found in Lin (2014). unmask implements the sample-moment-based density approximant method to estimate the smoothed density function of the original data from their masked data maskedVectorToBeUnmasked. The output of unmask is a set of sample data drawn from the estimated smoothed density function. The size of the output is the same as that of the original data, which were masked by multiplicative noise to yield maskedVectorToBeUnmasked.
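The role of sample moments can be illustrated with a minimal base R sketch. It does not call unmask or mask and is not the package's implementation; it assumes a simplified masking model in which the masked value is the plain product of the original value and an independent noise draw. The variables y, noise and ystar below are hypothetical illustrations. Under that assumption, the k-th moment of the masked data is the product of the k-th moments of the original data and the noise, so the moments of the original data can be estimated from the masked data and a noise sample alone.

```r
# Illustration only: base R, simplified masking model (ystar = y * noise).
set.seed(1)
n <- 10000
y <- rnorm(n, mean = 40, sd = 5)         # hypothetical original data
noise <- runif(n, min = 80, max = 120)   # hypothetical positive noise, independent of y
ystar <- y * noise                       # noise-multiplied (masked) data

# Under independence, E[(Y*)^k] = E[Y^k] * E[C^k], so the k-th moment of Y
# can be estimated using only the masked data and a sample of the noise.
k <- 1:4
est_moments  <- sapply(k, function(j) mean(ystar^j) / mean(noise^j))
true_moments <- sapply(k, function(j) mean(y^j))

# Relative error of each estimated moment against the (normally unknown) truth.
rel_error <- abs(est_moments - true_moments) / true_moments
print(round(rel_error, 4))
```

A data user would hold only ystar and the noise sample; true_moments is computed here solely to check the estimates.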
Returns a list with four elements.
unmaskedVariable | vector of unmasked data
outMeanOfNoise | sample mean of the noise
outMeanOfSquaredNoise | sample mean of the squared noise
prob | vector giving the estimated mass function, returned if the original data are categorical
Yan-Xia Lin
Lin, Yan-Xia (2014). Density approximant based on noise multiplied data. In J. Domingo-Ferrer (Ed.), Privacy in Statistical Databases 2014, LNCS 8744, Springer International Publishing Switzerland, pp. 89-104.
Lin, Yan-Xia and Fielding, Mark James (2015). MaskDensity14: An R Package for the Density Approximant of a Univariate Based on Noise Multiplied Data. SoftwareX 3-4, 37-43, doi:10.1016/j.softx.2015.11.002
##---- Should be DIRECTLY executable !! ----
##-- ==> Define data, use random,
##-- or do help(data=index) for the standard data sets.
#Example 1:
set.seed(123)
n <- 10000
y <- rmulti(n=n, mean=c(30, 50), sd=c(4,2), p=c(0.3, 0.7))
# y is a sample drawn from Y.
noise<-rmulti(n=n, mean=c(80, 100), sd=c(5,3), p=c(0.6, 0.4))
# noise is a sample drawn from C.
a1<-runif(1, min=min(y)-2,max=min(y))
b1<-runif(1, min=max(y), max=max(y)+2)
ymask<-mask(vectorToBeMasked=y, noisefile=file.path(tempdir(),"noise.bin"), noise,
lowerBoundAsGivenByProvider=a1, upperBoundAsGivenByProvider=b1)
write(ymask$ystar, file.path(tempdir(),"ystar.dat")) # Create masked data and noise.bin.
# The two files can be released to the public.
# After receiving the two files "ystar.dat" and
# "noise.bin", the data user can use the following code to
# obtain synthetic data for the original data.
ystar <- scan(file.path(tempdir(),"ystar.dat"))
y1 <- unmask(maskedVectorToBeUnmasked=ystar, noisefile=file.path(tempdir(),"noise.bin"))
sample<-y1$unmaskedVariable
# y1$unmaskedVariable gives the synthetic data of the
# original data y. The size of the synthetic data is the
# same as that of y
plot(density(y1$unmaskedVariable), main="density(ymask)", xlab="y")
# the plot of the approximant of $f_Y$
#Example 2:
set.seed(124)
n<-2000
a<-170
b<-80
y<-rbinom(n, 1, 0.1)+1
noise<-(a+b)/2+ sqrt(1+(a-b)^2/4)*rnorm(n, 0,1)
noise[noise<0]<- - noise[noise<0]
ymask<-mask(vectorToBeMasked=factor(y), noisefile=file.path(tempdir(),"noise.bin"), noise,
lowerBoundAsGivenByProvider=0,upperBoundAsGivenByProvider=3)
# using factor(y) because y is a categorical variable
write(ymask$ystar, file.path(tempdir(),"ystar.dat"))
ystar<-scan(file.path(tempdir(),"ystar.dat"))
y1 <- unmask(maskedVectorToBeUnmasked=ystar, noisefile=file.path(tempdir(),"noise.bin"))
unmaskY<-y1$unmaskedVariable # synthetic data
mass_function<-y1$prob # estimated mass function