knitr::opts_knit$set(
        stop_on_error = 2L
)
knitr::opts_chunk$set(
    fig.height = 7,
    fig.width = 7
)

Differentially Private Covariance with dpCovariance

The dpCovariance class computes a differentially private covariance matrix for a set of vectors. It supports any vector type that can be represented numerically, which in R means the types numeric and integer.

Syntax

# import the library
library(PSIlence)

# example data
x1 <- c(3, 12, 20, 42, 33, 65, 70, 54, 33, 45)
x2 <- c(11, 42, 16, 20, 21, 86, 30, 50, 73, 94)
data <- data.frame(x1, x2)

# range of the example data
# we do this here to make the ranges easier to pass into the call to dpCovariance
range1 <- c(0,70)
range2 <- c(0,100)
ranges <- list(range1, range2)

dpCovarianceExample <- dpCovariance$new(
    mechanism = 'mechanismLaplace',
    varType = 'numeric',
    n = 10,
    epsilon = c(1, 1, 1),
    columns = c('x1', 'x2'),
    rng = ranges
)
dpCovarianceExample$release(data)
print(dpCovarianceExample$result)
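
For comparison, the following minimal sketch computes the ordinary, non-private covariance of the same example data using only base R. It is not part of PSIlence and offers no privacy guarantee. It also counts the unique entries of the symmetric 2 x 2 matrix, which equals the length of the epsilon vector passed above; reading epsilon as one value per unique entry is an inference from the examples, not something the text states.

```r
# Same example data as in the Syntax section
x1 <- c(3, 12, 20, 42, 33, 65, 70, 54, 33, 45)
x2 <- c(11, 42, 16, 20, 21, 86, 30, 50, 73, 94)
data <- data.frame(x1, x2)

# Non-private sample covariance matrix (base R, no privacy protection)
true_cov <- cov(data)

# A symmetric 2 x 2 matrix has 3 unique entries: the lower triangle
# including the diagonal
unique_entries <- true_cov[lower.tri(true_cov, diag = TRUE)]

print(true_cov)
print(length(unique_entries))
```

With epsilon values as large as 1 and only 10 observations, a private release may differ noticeably from this baseline.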

Arguments

In typical usage, the dpCovariance class has two methods: new and release. The new method does not touch any data; it only creates an object that can calculate a differentially private covariance matrix. Only the release method touches data, applying the functionality of the previously created object to it.
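
The separation between new and release can be illustrated with a toy reference class. This is a sketch of the pattern only: toyStatistic, its fields, and the value element are hypothetical names and are not part of PSIlence, and the toy computes an ordinary, non-private covariance.

```r
library(methods)

# Hypothetical toy class, NOT part of PSIlence. It only illustrates the
# "configure first, touch data later" pattern described above.
toyStatistic <- setRefClass(
  "toyStatistic",
  fields = list(columns = "character", result = "list"),
  methods = list(
    # release() is the only method that reads the data
    release = function(data) {
      result <<- list(value = cov(data[, columns]))
      invisible(.self)
    }
  )
)

# new() stores settings only; no data is passed in
toy <- toyStatistic$new(columns = c("x1", "x2"))

# release() applies the stored settings to a data frame
df <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(2, 4, 6, 9))
toy$release(df)
print(toy$result$value)
```

Keeping configuration separate from data access means an object can be created and inspected before any sensitive data is read.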

The new method creates an object of the class, and accepts the following arguments:

 

The release method accepts a single argument.

Values

The release method makes a call to the mechanism, which generates a list of statistical summaries available on the result field.
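
As a rough illustration of what a Laplace mechanism does, the sketch below adds Laplace noise with scale sensitivity / epsilon to a vector of statistics. The function names are hypothetical, and the sensitivity is treated as a given input; how PSIlence derives the sensitivity of a covariance from n and the column ranges is not described here, so this should not be read as the library's implementation.

```r
# Draw n samples from a Laplace(0, scale) distribution by inverse-CDF
# sampling, using base R only
rlaplace_sketch <- function(n, scale) {
  u <- runif(n, min = -0.5, max = 0.5)
  -scale * sign(u) * log(1 - 2 * abs(u))
}

# Add Laplace noise to a vector of statistics.
# `sensitivity` is an assumed input, not a value computed the way
# PSIlence computes it.
add_laplace_noise <- function(values, sensitivity, epsilon) {
  scale <- sensitivity / epsilon
  values + rlaplace_sketch(length(values), scale)
}

set.seed(1)
noisy <- add_laplace_noise(
  values = c(10, 20, 30),
  sensitivity = 1,
  epsilon = c(1, 1, 1)
)
print(noisy)
```

Smaller epsilon values give a larger noise scale, and therefore stronger privacy at the cost of accuracy.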

 

The list in the result field has the following values.

Examples

Load the PSIlence library and the sample dataset:

library(PSIlence)
data(PUMS5extract10000)

 

To calculate a private covariance matrix for a set of numeric columns with dpCovariance, specify the mechanism (the Laplace mechanism, 'mechanismLaplace'), the variable type ('numeric'), the names of the columns of interest in the data frame, the number of observations in the data frame, the epsilon values (a vector in the example below, each generally less than 1), and the list of ranges for the chosen columns:

# before calculating the covariance, create a list of the column ranges,
# where each element contains the minimum and maximum value of a column,
# in the order in which the columns are passed into the dpCovariance call
income_range <- c(0, 750000)
education_range <- c(1, 16)
age_range <- c(0, 120)
ranges <- list(income_range, education_range, age_range)

numeric_covariance <- dpCovariance$new(
    mechanism = 'mechanismLaplace',
    varType = 'numeric',
    columns = c('income', 'educ', 'age'),
    n = 10000,
    epsilon = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1),
    rng = ranges
)
numeric_covariance$release(PUMS5extract10000)
print(numeric_covariance$result)
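
The epsilon vector above has six values for three columns, and the earlier two-column example used three. Both counts match the number of unique entries in a symmetric covariance matrix, k(k + 1)/2. The sketch below checks that relationship with base R. Treating epsilon as one value per unique entry, and summing the values to get a total privacy cost under basic sequential composition, are assumptions rather than documented behavior of PSIlence.

```r
columns <- c("income", "educ", "age")
epsilon <- c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1)

# Number of unique entries in a symmetric k x k covariance matrix
k <- length(columns)
n_unique <- k * (k + 1) / 2

# Assumed convention: one epsilon value per unique entry
stopifnot(length(epsilon) == n_unique)

# Total privacy cost if basic sequential composition applies (assumption)
total_epsilon <- sum(epsilon)

print(n_unique)
print(total_epsilon)
```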

Notes



privacytoolsproject/PSI-Library documentation built on Feb. 17, 2020, 2:03 p.m.