histogramGetAccuracy: Histogram accuracy


View source: R/utilities-histogram.R

Description

Determine the accuracy of a differentially private histogram release, given epsilon and delta.

Usage

histogramGetAccuracy(mechanism, epsilon, delta = 2^-30, alpha = 0.05,
  sensitivity)

Arguments

mechanism

A string indicating the mechanism that will be used to construct the histogram

epsilon

A numeric vector representing the epsilon privacy parameter. Should be of length one and should be between zero and one.

delta

The probability of an arbitrary leakage of information from the data. Should be of length one and should be a very small value. Defaults to 2^-30.

alpha

A numeric vector of length one specifying the statistical significance level. Defaults to 0.05.

Details

In differential privacy, "accuracy" is defined as the threshold value above which a given value is "significantly different" from the expected value. Mathematically, this is written as:

α = Pr[Y > a]

where α is the statistical significance level, a is the accuracy, and Y is a random variable indicating the difference between the differentially private noisy output and the true value. This equation says that, with probability 1-α, the count of a histogram bin will be within a of the true count.
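One way to read this definition is that the accuracy a acts as the half-width of an interval around the released count. The following is a minimal sketch in R; `accuracyInterval` is a hypothetical helper that is not part of the library, and the numbers are illustrative only.

```r
# Hypothetical helper (not part of the PSI library): turn a noisy bin count
# and an accuracy value into an interval that, by the definition above,
# contains the true count with probability 1 - alpha.
accuracyInterval <- function(noisyCount, accuracy) {
  c(lower = noisyCount - accuracy, upper = noisyCount + accuracy)
}

# Illustrative values only: a noisy count of 120 and an accuracy of 12.
interval <- accuracyInterval(noisyCount = 120, accuracy = 12)
print(interval)
```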

The equation for Y is:

Y = |X - μ|

where μ is the true value of a bin and X is the noisy count. X follows a Laplace distribution centered at μ. Subtracting μ centers the difference at 0, and taking the absolute value "folds" the Laplace distribution onto the non-negative axis. The absolute value is taken because the difference between noisy and true outputs is measured in magnitude.
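The folding step can be illustrated with a small simulation. This is a sketch under stated assumptions: base R has no Laplace sampler, so `rLaplaceSketch` is a hypothetical helper that builds Laplace noise as the difference of two exponential draws, and the values of `mu` and `lambda` are made up for the example.

```r
# Draw n Laplace(mu, lambda) samples in base R. The difference of two
# independent Exponential(rate = 1 / lambda) draws is Laplace(0, lambda).
# rLaplaceSketch is a hypothetical helper, not part of the PSI library.
rLaplaceSketch <- function(n, mu, lambda) {
  mu + rexp(n, rate = 1 / lambda) - rexp(n, rate = 1 / lambda)
}

set.seed(1)
mu <- 100        # assumed true bin count
lambda <- 4      # assumed Laplace scale
x <- rLaplaceSketch(1e5, mu, lambda)   # noisy counts X
y <- abs(x - mu)                       # folded error Y = |X - mu|

# Y is non-negative, and its sample mean should be close to lambda,
# which is the mean of the folded Laplace distribution.
print(min(y) >= 0)
print(mean(y))
```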

Deriving the accuracy formula:

  1. The probability density function (PDF) f(x) of the Laplace distribution is:
    f(x) = (1 / (2λ)) * e^(-|x - μ| / λ)

  2. Using the definition of Y above, folding the Laplace distribution doubles the density on the non-negative axis, so the PDF g(y) of Y, for y ≥ 0, is:
    g(y) = (1 / λ) * e^(-y / λ)

  3. Using α = Pr[Y > a] and the PDF g(y), the upper tail probability is Pr[Y > a] = e^(-a / λ). Setting this equal to α and solving gives a = λ * ln(1 / α). Plugging in λ = 2 / ε yields the accuracy formula:

    a = (2 / ε) * ln(1 / α)

  4. The accuracy formula for the stability mechanism is derived by adding the accuracy formula above to the mechanism's threshold (the worst-case noise that the stability mechanism can add), which is (2 / ε) * ln(2 / δ) + 1.
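The formulas in steps 3 and 4 can be transcribed into R and checked numerically. This is a sketch, not the library's implementation: `laplaceAccuracySketch` and `stabilityAccuracySketch` are hypothetical names, the stability version assumes that step 4 means the sum of the Laplace accuracy and the threshold term, and the scale λ = 2 / ε is taken from step 3.

```r
# Hypothetical helpers that transcribe the formulas in steps 3 and 4;
# they are not the library's implementation.
laplaceAccuracySketch <- function(epsilon, alpha = 0.05) {
  (2 / epsilon) * log(1 / alpha)
}

# Assumption: step 4 is read as "Laplace accuracy + threshold".
stabilityAccuracySketch <- function(epsilon, delta = 2^-30, alpha = 0.05) {
  threshold <- (2 / epsilon) * log(2 / delta) + 1
  laplaceAccuracySketch(epsilon, alpha) + threshold
}

epsilon <- 0.5
alpha <- 0.05
a <- laplaceAccuracySketch(epsilon, alpha)

# Empirical check: Y = |X - mu| is exponential with scale lambda = 2 / epsilon,
# so the share of draws above a should be close to alpha.
set.seed(42)
yCheck <- rexp(1e5, rate = epsilon / 2)
print(a)
print(mean(yCheck > a))
print(stabilityAccuracySketch(epsilon, alpha = alpha))
```

Because the stability version only adds a positive threshold term, it is always larger than the Laplace accuracy for the same epsilon and alpha.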

Value

Accuracy guarantee for histogram release, given epsilon.

References

Vadhan, S. The Complexity of Differential Privacy. Section 3.3, Releasing Stable Values, pp. 23-24. March 2017.


IQSS/PSI-Library documentation built on Feb. 15, 2020, 9:03 p.m.