genSupMF: Supervised matrix factorization for exponential family data


View source: R/gen_sup_mf.R

Description

Supervised matrix factorization for exponential family data

Usage

genSupMF(x, y, k = 2, alpha = NULL, family_x = c("gaussian", "binomial",
  "poisson"), family_y = c("gaussian", "binomial", "poisson"), quiet = TRUE,
  max_iters = 1000, conv_criteria = 1e-05, random_start = FALSE, start_A,
  start_B, start_beta, mu, update_A = TRUE, update_beta = TRUE,
  lambda = 0.01)

Arguments

x

covariate matrix

y

response vector

k

number of latent dimensions, i.e. the number of columns of the scores A and the loadings B

alpha

balance between dimensionality reduction of x and prediction of y

family_x

exponential family distribution of covariates

family_y

exponential family distribution of response

quiet

logical; if FALSE, the calculation gives feedback as it runs

max_iters

maximum number of iterations

conv_criteria

convergence criterion (a numeric tolerance)

random_start

logical; whether to randomly initialize A and B

start_A

initial value for A

start_B

initial value for B

start_beta

initial value for beta. Mainly effective if update_beta is TRUE

mu

specific value for mu, the mean vector of x

update_A

logical; whether to update the scores A

update_beta

logical; whether to update the coefficients beta

lambda

an L2 penalty on the parameters to prevent numerical issues. This was reportedly used in Rish's implementation
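
The defaults of family_x and family_y are written as character vectors of the allowed choices. In R this style is usually paired with match.arg(), which picks the first choice when the argument is omitted. The sketch below illustrates that convention with a hypothetical helper, resolve_family; whether genSupMF resolves its family arguments this way is an assumption, not something this page confirms.

```r
# Hypothetical helper (not part of genSupPCA) that mimics the signature style
# used for family_x and family_y.
resolve_family = function(family = c("gaussian", "binomial", "poisson")) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and otherwise validates (and partially matches) the input.
  match.arg(family)
}

resolve_family()            # argument omitted: first choice is used
resolve_family("poisson")   # explicit choice
resolve_family("bin")       # partial matching is allowed by match.arg()
```

Under this assumption, omitting both family arguments would fit a Gaussian model for x and y, so it is safer to state the families explicitly, as the Examples do.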

Value

An S3 object of class gsmf which is a list with the following components:

mu

the main effects for dimensionality reduction

A

the n x k matrix of scores

B

the d x k matrix of loadings

beta

the vector of coefficients, of length k + 1 (an intercept followed by one coefficient per column of A)

family_x

the exponential family of covariates

family_y

the exponential family of response

iters

number of iterations required for convergence

loss_trace

the trace of the average negative log likelihood of the algorithm. Should be non-increasing
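
As a rough illustration of how these components fit together, the sketch below builds stand-in values with the documented shapes and combines them. It assumes the usual low-rank form for the natural parameters of x (main effects plus scores times transposed loadings), which this page does not state explicitly; the linear predictor for y follows the expression used in the Examples. All values are made up, and plogis() from base R stands in for the inverse logit.

```r
set.seed(1)
n = 5; d = 4; k = 2

# Made-up stand-ins for the components of a fitted gsmf object
mu   = rnorm(d)                    # main effects, length d
A    = matrix(rnorm(n * k), n, k)  # scores, n x k
B    = matrix(rnorm(d * k), d, k)  # loadings, d x k
beta = rnorm(k + 1)                # intercept followed by k coefficients

# Assumed low-rank natural parameters for x: row i is mu + B %*% A[i, ]
theta_x = outer(rep(1, n), mu) + A %*% t(B)

# Linear predictor for y, as in the Examples section
eta_y = cbind(1, A) %*% beta

# Canonical inverse links map natural parameters to means
mean_x_poisson  = exp(theta_x)     # if family_x = "poisson"
prob_y_binomial = plogis(eta_y)    # if family_y = "binomial"

dim(theta_x)
dim(eta_y)
```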

References

Rish, Irina, et al. "Closed-form supervised dimensionality reduction with generalized linear models." Proceedings of the 25th international conference on Machine learning. ACM, 2008.

Examples

rows = 100
cols = 10
set.seed(1)
mat_np = outer(rnorm(rows), rnorm(cols))

# generate a count matrix and binary response
mat = matrix(rpois(rows * cols, c(exp(mat_np))), rows, cols)
response = rbinom(rows, 1, rowSums(mat) / max(rowSums(mat)))

mod = genSupMF(mat, response, k = 1, alpha = 1, family_x = "poisson", family_y = "binomial",
               quiet = FALSE)

plot(inv.logit.mat(cbind(1, mod$A) %*% mod$beta), response)
plot(rowSums(mat), response)
## Not run: 
library(ggplot2)
# the scores are stored in mod$A (see Value)
ggplot(data.frame(PC = mod$A[, 1], y = response), aes(PC, y)) + stat_summary_bin(bins = 10)

## End(Not run)
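
The Value section notes that loss_trace should be non-increasing. A quick way to check this after fitting might look like the sketch below. The helper is_non_increasing and its tolerance are hypothetical, not part of the package, and the traces are made-up numbers standing in for mod$loss_trace.

```r
# Hypothetical helper (not part of genSupPCA): TRUE when no step of the trace
# rises by more than a small numerical tolerance.
is_non_increasing = function(trace, tol = 1e-8) {
  all(diff(trace) <= tol)
}

# Made-up traces for illustration; with a fitted model use mod$loss_trace
good_trace = c(2.31, 1.87, 1.62, 1.60, 1.60)
bad_trace  = c(2.31, 1.87, 1.95, 1.60)

is_non_increasing(good_trace)
is_non_increasing(bad_trace)
```

A trace that fails this check may point to numerical trouble, in which case a larger lambda or a different starting value is worth trying.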

andland/genSupPCA documentation built on May 30, 2019, 11:43 a.m.