logisticSVD: Logistic Singular Value Decomposition


View source: R/logisticSVD.R

Description

Dimensionality reduction for binary data by extending SVD to minimize binomial deviance.
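As a sketch of the objective, assuming the parameterization suggested by the returned components mu, A and B (the symbols n, d, theta and sigma are notation introduced here for illustration, not names used by the package), the natural parameters and the binomial deviance can be written as:

$$
\theta_{ij} = \mu_j + \sum_{l=1}^{k} a_{il}\, b_{jl}, \qquad
\Pr(x_{ij} = 1) = \sigma(\theta_{ij}) = \frac{1}{1 + e^{-\theta_{ij}}}
$$

$$
D(\mu, A, B) = -2 \sum_{i=1}^{n} \sum_{j=1}^{d}
\left[ x_{ij} \log \sigma(\theta_{ij}) + (1 - x_{ij}) \log\bigl(1 - \sigma(\theta_{ij})\bigr) \right]
$$

Here x is taken to be n x d, A to be n x k and B to be d x k. With main_effects = FALSE, the mu term would presumably be fixed at zero.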

Usage

logisticSVD(x, k = 2, quiet = TRUE, max_iters = 1000,
  conv_criteria = 1e-05, random_start = FALSE, start_A, start_B, start_mu,
  partial_decomp = TRUE, main_effects = TRUE, use_irlba)

Arguments

x

matrix with all binary entries

k

rank of the SVD

quiet

logical; whether the calculation should give feedback

max_iters

maximum number of iterations

conv_criteria

convergence criterion. The algorithm stops once the difference in average deviance between successive iterations falls below this value

random_start

logical; whether to randomly initialize the parameters. If FALSE, the algorithm uses an SVD as the starting value

start_A

starting value for the left singular vectors

start_B

starting value for the right singular vectors

start_mu

starting value for mu. Only used if main_effects = TRUE

partial_decomp

logical; if TRUE, the function uses the rARPACK package to more quickly calculate the SVD. When the number of columns is small, the approximation may be less accurate and slower

main_effects

logical; whether to include main effects in the model

use_irlba

deprecated. Use partial_decomp instead
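The way max_iters and conv_criteria interact can be sketched as a simple stopping rule. This is a minimal illustration in base R, not the package's own code: has_converged is a hypothetical helper, and the toy loss sequence stands in for the real average deviance.

```r
# Hypothetical helper (not part of logisticPCA): TRUE once the change in
# average deviance between the last two iterations is below the tolerance.
has_converged <- function(loss_trace, conv_criteria = 1e-05) {
  m <- length(loss_trace)
  m >= 2 && abs(loss_trace[m - 1] - loss_trace[m]) < conv_criteria
}

max_iters <- 1000
loss_trace <- numeric(0)

for (m in seq_len(max_iters)) {
  # Toy, strictly decreasing loss used in place of a real model update
  loss_trace[m] <- 0.5 + 0.5^m
  if (has_converged(loss_trace)) break
}

# Number of iterations actually run before the rule fired
iters <- length(loss_trace)
```

Under this rule, a smaller conv_criteria means more iterations before stopping, and max_iters caps the loop if the tolerance is never reached.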

Value

An S3 object of class lsvd which is a list with the following components:

mu

the main effects

A

a k-dimensional orthogonal matrix with the scaled left singular vectors

B

a k-dimensional orthonormal matrix with the right singular vectors

iters

number of iterations required for convergence

loss_trace

the trace of the average negative log likelihood of the algorithm. Should be non-increasing

prop_deviance_expl

the proportion of deviance explained by this model. If main_effects = TRUE, the null model is just the main effects, otherwise the null model estimates 0 for all natural parameters.
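To show how the components might fit together, the sketch below rebuilds fitted probabilities from mu, A and B and computes a proportion of deviance explained against the all-zero null model. It uses base R only. The list fit is a hypothetical stand-in for a fitted object, and the reconstruction theta = 1 mu' + A B' is an assumption based on the component descriptions above, not a statement of the package's internals.

```r
# Hypothetical stand-in for a fitted object (n = 4 rows, d = 3 columns, k = 1).
# In practice these components would come from logisticSVD().
fit <- list(
  mu = rep(0, 3),                     # main effects (zero, as if main_effects = FALSE)
  A  = matrix(c(-2, -1, 1, 2), 4, 1), # scaled left singular vectors
  B  = matrix(c(1, -1, 1), 3, 1)      # right singular vectors
)
n <- nrow(fit$A)

# Assumed natural parameters: theta_ij = mu_j + (A B')_ij
theta <- outer(rep(1, n), fit$mu) + tcrossprod(fit$A, fit$B)

# Fitted probabilities via the inverse logit
prob <- plogis(theta)

# Toy binary data that agrees in sign with theta
x <- (theta > 0) * 1

# Binomial deviance of x under a matrix of probabilities p
deviance_binary <- function(x, p) {
  -2 * sum(x * log(p) + (1 - x) * log(1 - p))
}

dev_fit  <- deviance_binary(x, prob)
# Null model without main effects: all natural parameters are 0, so p = 0.5
dev_null <- deviance_binary(x, matrix(0.5, nrow(x), ncol(x)))

prop_deviance_expl <- 1 - dev_fit / dev_null
```

With main effects included, the null model would instead use one fitted probability per column, so the same data can give a different proportion.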

References

de Leeuw, Jan, 2006. Principal component analysis of binary data by iterated singular value decomposition. Computational Statistics & Data Analysis 50 (1), 21–39.

Collins, M., Dasgupta, S., & Schapire, R. E., 2001. A generalization of principal components analysis to the exponential family. In NIPS, 617–624.

Examples

library(logisticPCA)

# construct a low rank matrix on the logit scale
rows = 100
cols = 10
set.seed(1)
mat_logit = outer(rnorm(rows), rnorm(cols))

# generate a binary matrix
mat = (matrix(runif(rows * cols), rows, cols) <= inv.logit.mat(mat_logit)) * 1.0

# run logistic SVD on it
lsvd = logisticSVD(mat, k = 1, main_effects = FALSE, partial_decomp = FALSE)

# Logistic SVD likely does a better job finding latent features
# than standard SVD
plot(svd(mat_logit)$u[, 1], lsvd$A[, 1])
plot(svd(mat_logit)$u[, 1], svd(mat)$u[, 1])

logisticPCA documentation built on May 1, 2019, 10:12 p.m.