PrivReg: Private regression with vertically partitioned data

PrivReg R Documentation

Private regression with vertically partitioned data

Description

Perform privacy-preserving regression modeling across institutions that each hold different variables of a vertically partitioned data set. This class implements regression with Gaussian and binomial responses using block coordinate descent.

Value

an R6 object of class PrivReg

Usage

alice <- PrivReg$new(
  formula,
  data,
  family    = "gaussian",
  name      = "alice",
  verbose   = FALSE,
  debug     = FALSE,
  crypt_key = "testkey"
)

alice$listen()
alice$connect("127.0.0.1")
alice$disconnect()

alice$estimate()
alice$calculate_se()

alice$summary()
alice$coef()
alice$converged()
alice$plot_paths()
alice$elapsed()

Arguments

  • formula model formula for the regression model at this institution

  • data data frame for the variables in the model formula

  • family response family as in glm(). Currently only "gaussian" and "binomial" are supported

  • intercept whether to include the intercept. Always use this instead of + 0 in the model formula

  • name name of this institution

  • verbose whether to print information

  • debug whether to print debug statements

  • crypt_key pre-shared key used to encrypt communication

Details

  • $new() instantiates and returns a new PrivReg object.

  • $listen() listens for incoming connections from a partner institution

  • $connect() connects to a listening partner institution

  • $disconnect() disconnects from the partner institution

  • $set_control() sets control parameters. See below for more info

  • $estimate() computes parameter estimates through block coordinate descent

  • $calculate_se() computes standard errors using the projection method

  • $converged() tests whether the algorithm has converged

  • $summary() displays a summary of the object, invisibly returns the coef matrix

  • $coef() returns the model coefficients

  • $plot_paths() plots the paths of the parameters over the estimation iterations

  • $elapsed() prints information about the elapsed time
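
As an illustration of what $estimate() and $converged() refer to, the following is a minimal single-process sketch of block coordinate descent for a Gaussian response. It is not the package's implementation: the names X_a, X_b, b_a and b_b are hypothetical, no networking or encryption is involved, and both "institutions" live in the same R session. Each party repeatedly fits its own block of columns to the part of the response that the other party's current prediction leaves unexplained.

```r
# Sketch only: simulates two parties in one session (assumption, not privreg code)
set.seed(1)
n    <- 200
X    <- matrix(rnorm(n * 4), n)
beta <- c(1, -0.5, 0.25, 2)
y    <- drop(X %*% beta + rnorm(n))

X_a <- X[, 1:2]  # columns held by "alice" (hypothetical split)
X_b <- X[, 3:4]  # columns held by "bob"

b_a <- rep(0, 2)
b_b <- rep(0, 2)
max_iter  <- 1000
tol       <- 1e-8
converged <- FALSE

for (iter in seq_len(max_iter)) {
  old <- c(b_a, b_b)
  # alice: least squares of (y minus bob's prediction) on her own columns
  b_a <- qr.solve(X_a, y - drop(X_b %*% b_b))
  # bob: least squares of (y minus alice's prediction) on his own columns
  b_b <- qr.solve(X_b, y - drop(X_a %*% b_a))
  # stop when every coefficient change is below tol
  if (max(abs(c(b_a, b_b) - old)) < tol) {
    converged <- TRUE
    break
  }
}

bcd_coef <- c(b_a, b_b)
ols_coef <- unname(coef(lm(y ~ X + 0)))  # joint fit on the pooled data
```

In this sketch the block-wise estimates in bcd_coef agree with the pooled least-squares fit in ols_coef up to the chosen tolerance, even though neither block update uses the other party's columns directly, only the other party's current prediction.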

Control parameters

  • max_iter maximum number of iterations of the coordinate descent algorithm

  • tol convergence tolerance; the algorithm is considered converged when all changes in the coefficients (beta) between iterations are below tol

  • se whether to compute standard errors
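
To make the roles of max_iter and tol concrete for a binomial response, here is a hedged single-process sketch. It assumes that each block update is an ordinary logistic fit with the partner's current linear predictor supplied as an offset; this is one plausible way to do block coordinate descent for this family and is not confirmed to be how the package does it. The names X_a, X_b, b_a and b_b are hypothetical.

```r
# Sketch only: offset-based block updates for a binomial response (assumption)
set.seed(2)
n    <- 500
X    <- matrix(rnorm(n * 4), n)
beta <- c(1, -1, 0.5, -0.5)
y    <- rbinom(n, 1, plogis(drop(X %*% beta)))

X_a <- X[, 1:2]  # columns held by "alice" (hypothetical split)
X_b <- X[, 3:4]  # columns held by "bob"

b_a <- rep(0, 2)
b_b <- rep(0, 2)
max_iter  <- 1000  # upper bound on the number of sweeps
tol       <- 1e-6  # stop when all coefficient changes are below this
converged <- FALSE

for (iter in seq_len(max_iter)) {
  old <- c(b_a, b_b)
  # alice: logistic fit on her columns, bob's linear predictor held fixed as offset
  b_a <- unname(glm.fit(X_a, y, offset = drop(X_b %*% b_b),
                        family = binomial(), intercept = FALSE)$coefficients)
  # bob: logistic fit on his columns, alice's linear predictor held fixed as offset
  b_b <- unname(glm.fit(X_b, y, offset = drop(X_a %*% b_a),
                        family = binomial(), intercept = FALSE)$coefficients)
  if (max(abs(c(b_a, b_b) - old)) < tol) {
    converged <- TRUE
    break
  }
}

bcd_coef <- c(b_a, b_b)
glm_coef <- unname(coef(glm(y ~ X + 0, family = binomial())))  # pooled fit
```

A smaller tol makes the sweeps run longer and brings bcd_coef closer to the pooled fit in glm_coef; max_iter only guards against a loop that fails to settle.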

Examples

## Not run: 
# generate some data
set.seed(45)
X <- matrix(rnorm(1000), 100)
b <- runif(10, -1, 1)
S <- cov(X) # assumed definition: sample covariance of the predictors
y <- X %*% b + rnorm(100, sd = sqrt(b %*% S %*% b))

# split into alice and bob institutions
alice_data <- data.frame(y, X[, 1:5])
bob_data   <- data.frame(y, X[, 6:10])

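# create one PrivReg object per institution
# (data.frame() names the predictor columns X1..X5 in both data sets)
alice <- PrivReg$new(y ~ X1 + X2 + X3 + X4 + X5, data = alice_data,
                     intercept = FALSE, name = "alice")
bob   <- PrivReg$new(y ~ X1 + X2 + X3 + X4 + X5, data = bob_data,
                     intercept = FALSE, name = "bob")

# create connection
alice$listen()
bob$connect("127.0.0.1") # if alice is on a different computer, change the IP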

# estimate
alice$estimate()

# ...

# compare results to lm()
summary(lm(y ~ X + 0))
alice$summary()
bob$summary()

## End(Not run)


vankesteren/privreg documentation built on March 4, 2024, 10:47 a.m.