knitr::opts_knit$set(stop_on_error = 2L)
knitr::opts_chunk$set(fig.height = 7, fig.width = 7)
dpMean
The dpMean class evaluates a privacy-preserving mean of a vector of values. The class supports any vector type that can be represented numerically, meaning that it can handle the R types numeric, integer, and logical.
library(PSIlence)

x1 <- c(3, 12, 20, 42, 33, 65, 70, 54, 33, 45)
x2 <- c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
data <- data.frame(x1, x2)

dpMeanExample <- dpMean$new(mechanism='mechanismLaplace', varType='numeric', variable='x1', epsilon=0.1, n=10, rng=c(0, 70))
out <- dpMeanExample$release(data)

dpMeanExample2 <- dpMean$new(mechanism='mechanismLaplace', varType='logical', variable='x2', epsilon=0.1, n=10)
out2 <- dpMeanExample2$release(data)
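For intuition, the Laplace mechanism for a bounded mean can be sketched in base R. This is a minimal textbook illustration, not PSIlence's internal code: the helper names rlaplaceSketch and dpMeanSketch are hypothetical, and the sensitivity (upper - lower) / n assumes that values are clamped to rng and that n is public.

```r
# Hypothetical helper (not part of PSIlence): draw Laplace(0, scale) noise
# by inverse-CDF sampling from a uniform on (-0.5, 0.5).
rlaplaceSketch <- function(size, scale) {
  u <- runif(size, min = -0.5, max = 0.5)
  -scale * sign(u) * log(1 - 2 * abs(u))
}

# Hypothetical helper: a Laplace-perturbed mean of a bounded vector.
dpMeanSketch <- function(x, rng, epsilon) {
  x <- as.numeric(x)                    # logical TRUE/FALSE become 1/0
  x <- pmin(pmax(x, rng[1]), rng[2])    # clamp values to the a priori bounds
  n <- length(x)
  sensitivity <- (rng[2] - rng[1]) / n  # assumed sensitivity of a bounded mean
  mean(x) + rlaplaceSketch(1, scale = sensitivity / epsilon)
}

set.seed(42)
x1 <- c(3, 12, 20, 42, 33, 65, 70, 54, 33, 45)
noisyMean <- dpMeanSketch(x1, rng = c(0, 70), epsilon = 0.1)
print(noisyMean)
```

With only 10 observations and epsilon of 0.1, the noise scale in this sketch is 70, so a single release can land far from the true mean of 37.7. Larger n or larger epsilon shrinks the noise.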
In typical usage, there are two methods of the dpMean class. The new method, which creates an object of the class, accepts the following arguments.
mechanism
\ Character, the class name of the mechanism used to perturb the true estimate, one of 'mechanismLaplace' or 'mechanismBootstrap'.
varType
\ Character, the type of values in the data frame that will be passed to the mechanism. Should be one of 'numeric', 'integer', or 'logical'.
n
\ Integer, the number of observations in the vector.
rng
\ Numeric, a 2-tuple giving an a priori estimate of the lower and upper bounds of the vector.
epsilon
\ Numeric, the differential privacy parameter $\epsilon$, typically taking values between 0 and 1 and reflecting the privacy cost of the query. Optional, default NULL. If NULL, the user must specify a value for accuracy.
accuracy
\ Numeric, the accuracy of the query. Optional, default NULL. If NULL, the user must specify a value for epsilon. If epsilon is not NULL, this value is ignored and evaluated internally.
imputeRng
\ Numeric, a 2-tuple giving a range within which missing values of the vector are imputed. Optional, default NULL. If NULL, missing values are imputed using the range provided in rng.
nBoot
\ Integer, the number of bootstrap replications to perform. Optional, default NULL. If not NULL, the privacy cost epsilon is partitioned across nBoot replications and the estimates for each are returned.
alpha
\ Numeric, the statistical significance level used in evaluating accuracy and privacy parameters. If the bootstrap is employed, alpha is also used to trim the release. Default 0.05.
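The trade-off between epsilon and accuracy described above can be illustrated with the standard Laplace tail bound: noise with scale b exceeds b * log(1 / alpha) in absolute value with probability alpha. The sketch below assumes that bound and a sensitivity of (upper - lower) / n. The function names are hypothetical, and the conversion PSIlence performs internally may differ in detail.

```r
# Assumed sensitivity of a mean over n values bounded by rng.
meanSensitivity <- function(rng, n) (rng[2] - rng[1]) / n

# Hypothetical helper: accuracy implied by a given epsilon at level alpha.
epsilonToAccuracy <- function(epsilon, rng, n, alpha = 0.05) {
  meanSensitivity(rng, n) / epsilon * log(1 / alpha)
}

# Hypothetical helper: epsilon needed to reach a given accuracy at level alpha.
accuracyToEpsilon <- function(accuracy, rng, n, alpha = 0.05) {
  meanSensitivity(rng, n) / accuracy * log(1 / alpha)
}

# Round trip: the two conversions are inverses of each other.
acc <- epsilonToAccuracy(epsilon = 0.1, rng = c(0, 750000), n = 10000)
eps <- accuracyToEpsilon(acc, rng = c(0, 750000), n = 10000)
print(c(accuracy = acc, epsilon = eps))
```

Under these assumptions, halving epsilon doubles the error bound, which is why only one of the two arguments needs to be supplied.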
The release method accepts a single argument.
x
\ Data frame containing a column corresponding to the name specified in variable.

Attach the sample dataset.
library(PSIlence)
data(PUMS5extract10000)
Calculate a private mean of a numeric vector with dpMean using the Laplace mechanism:
numericMean <- dpMean$new(mechanism='mechanismLaplace', varType='numeric', variable='income', n=10000, epsilon=0.1, rng=c(0, 750000))
numericMean$release(PUMS5extract10000)
print(numericMean$result)
To calculate the mean of a logical vector instead, set variable to a logical column of x and update varType and rng accordingly:
logicalMean <- dpMean$new(mechanism='mechanismLaplace', varType='logical', variable='married', n=10000, epsilon=0.1, rng=c(0, 1))
logicalMean$release(PUMS5extract10000)
print(logicalMean$result)
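The choice of rng=c(0, 1) for a logical variable follows from how R coerces logicals to numbers. The base R sketch below reuses the small x2 vector from the first example and compares the Laplace noise scale implied by the two rng settings used above. The laplaceScale helper is hypothetical and assumes a sensitivity of (upper - lower) / n.

```r
# A logical vector is coerced to 0/1, so its mean is the proportion of TRUE.
x2 <- c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
x2Numeric <- as.numeric(x2)
trueMean <- mean(x2Numeric)    # 0.6
valueRange <- range(x2Numeric) # 0 and 1, hence rng = c(0, 1)

# Hypothetical helper: Laplace noise scale under the assumed sensitivity.
laplaceScale <- function(rng, n, epsilon) (rng[2] - rng[1]) / (n * epsilon)

scaleLogical <- laplaceScale(c(0, 1), n = 10000, epsilon = 0.1)      # 0.001
scaleIncome <- laplaceScale(c(0, 750000), n = 10000, epsilon = 0.1)  # 750
print(c(logical = scaleLogical, income = scaleIncome))
```

The noise scale grows linearly with the width of rng, so an overly generous range costs accuracy at a fixed epsilon.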
The release method makes a call to the mechanism, which generates a list of statistical summaries available on the result field.
result
\ List, contains the accuracy guarantee, privacy cost, and private release. Other elements reflect post-processing of the release and vary with the variable type.
The list in the result field has the following values.
release
\ Differentially private estimate of the mean. If the bootstrap mechanism is used, one estimate for each bootstrap replication is provided (i.e., a vector of length nBoot).
accuracy
\ The accuracy guarantee of the release given epsilon.
epsilon
\ The privacy cost required to guarantee accuracy.
interval
\ Confidence interval of the private estimate given accuracy.
std.dev
\ The standard deviation of the vector. Only available for logical vectors.
median
\ The median of the vector. Only available for logical vectors.
histogram
\ The histogram of the vector. Only available for logical vectors.
std.error
\ Estimates of the standard error of the mean. Only available when the bootstrap mechanism is used.
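As a rough illustration of how the interval and the bootstrap budget relate to the other values, the base R sketch below builds an interval as release plus or minus accuracy, clipped to rng, and splits epsilon evenly across nBoot replications. Both constructions are assumptions made for illustration, the helper name intervalSketch and the input numbers are hypothetical, and the package may compute these quantities differently.

```r
# Hypothetical helper: symmetric interval around the release, clipped to rng.
intervalSketch <- function(release, accuracy, rng = NULL) {
  interval <- c(lower = release - accuracy, upper = release + accuracy)
  if (!is.null(rng)) {
    # Keep the interval inside the range the statistic can actually take.
    interval <- pmin(pmax(interval, rng[1]), rng[2])
  }
  interval
}

# Illustrative inputs, not output from PSIlence.
inner <- intervalSketch(release = 0.58, accuracy = 0.03, rng = c(0, 1))
edge <- intervalSketch(release = 0.99, accuracy = 0.03, rng = c(0, 1))
print(inner)
print(edge)

# Assumed even split of the privacy budget across bootstrap replications.
epsilon <- 0.1
nBoot <- 20
epsilonPerReplication <- epsilon / nBoot
print(epsilonPerReplication)
```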