genSupPCA: Supervised dimensionality reduction for exponential family...

View source: R/gen_sup_pca.R

Description

Supervised dimensionality reduction for exponential family data

Usage

genSupPCA(x, y, k = 2, alpha = NULL, m = 4, family_x = c("gaussian",
  "binomial", "poisson", "multinomial"), family_y = c("gaussian", "binomial",
  "poisson"), quiet = TRUE, max_iters = 100, max_iters_per = 3,
  conv_criteria = 1e-05, discrete_deriv = FALSE, init = c("svd", "pls",
  "random"), start_U, mu, start_beta, grassmann = FALSE)

Arguments

x

covariate matrix

y

response vector

k

dimension of the reduced representation (number of components)

alpha

balance between dimensionality reduction of x and prediction of y

m

value to approximate the saturated model in dimensionality reduction

family_x

exponential family distribution of covariates

family_y

exponential family distribution of response

quiet

logical; if FALSE, the calculation prints feedback while it runs

max_iters

maximum number of iterations

max_iters_per

maximum number of inner iterations within each outer iteration

conv_criteria

convergence criteria

discrete_deriv

logical; whether to calculate discrete derivatives with respect to U instead of the closed-form derivative with beta held constant

init

how to initialize U. "svd" uses the first k right singular vectors of x, "pls" uses the partial least squares loadings, and "random" initializes randomly. Ignored if start_U is specified.

start_U

initial value for U

mu

specific value for mu, the mean vector of x

start_beta

initial value for beta

grassmann

logical; whether to optimize U on the Grassmann manifold. If FALSE, U is optimized on the Stiefel manifold
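
To make the "svd" option of init more concrete, the base-R sketch below takes the first k right singular vectors of a toy matrix and checks that they are orthonormal. It is an illustration only: the variable names are hypothetical, and any transformation genSupPCA applies to x before the decomposition is not shown here.

```r
# Toy data; the names below are illustrative and not part of the package
set.seed(1)
x <- matrix(rnorm(100 * 10), 100, 10)
k <- 2

# First k right singular vectors of x (columns of V in x = U D V')
start_U <- svd(x)$v[, 1:k, drop = FALSE]

# The columns are orthonormal, so crossprod(start_U) is the k x k identity
# up to floating-point error
orth_error <- max(abs(crossprod(start_U) - diag(k)))
```

A matrix with this shape (one row per column of x, k columns) is the kind of value start_U would take; when start_U is supplied, init is ignored.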

Value

An S3 object of class gspca which is a list with the following components:

mu

the main effects for dimensionality reduction

U

the k-dimensional orthonormal matrix with the loadings

beta

the coefficient vector of length k + 1

PCs

the principal component scores

m

the value of m that was supplied

family_x

the exponential family of covariates

family_y

the exponential family of response

iters

number of iterations required for convergence

loss_trace

the trace of the average negative log likelihood over the iterations of the algorithm. Should be non-increasing.
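
The components above can be combined as in the following sketch, which uses a hand-made list in place of a fitted gspca object (all values are made up for illustration). It assumes, as the Examples section suggests with cbind(1, mod$PCs) %*% mod$beta, that the first element of beta is an intercept, and it uses the base-R inverse logit plogis() for a binomial response.

```r
# Hypothetical stand-in for a fitted object; the values are invented
fit <- list(
  PCs = matrix(c(-1.2, 0.3, 0.9), ncol = 1),  # scores for k = 1
  beta = c(0.5, 2),                           # assumed: intercept, then slope
  loss_trace = c(1.90, 1.42, 1.37, 1.37)
)

# Linear predictor: a column of ones bound to the scores, times beta
eta <- drop(cbind(1, fit$PCs) %*% fit$beta)

# For a binomial response, the inverse logit maps eta to probabilities
prob <- plogis(eta)

# Check that the loss never goes up from one iteration to the next
# (small tolerance for floating-point noise)
non_increasing <- all(diff(fit$loss_trace) <= 1e-8)
```

If non_increasing were FALSE for a real fit, that would be a sign that something went wrong during optimization, since the trace is documented as non-increasing.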

Examples

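rows = 100
cols = 10
set.seed(1)
# rank-one matrix of natural parameters (log means of the counts)
mat_np = outer(rnorm(rows), rnorm(cols))

# generate a count matrix and binary response
mat = matrix(rpois(rows * cols, c(exp(mat_np))), rows, cols)
response = rbinom(rows, 1, rowSums(mat) / max(rowSums(mat)))

# fit a one-component model with Poisson covariates and a binomial response
mod = genSupPCA(mat, response, k = 1, alpha = 0, family_x = "poisson", family_y = "binomial",
                quiet = FALSE, init = "pls")

# fitted probabilities (intercept plus PC scores times beta) against the response
plot(inv.logit.mat(cbind(1, mod$PCs) %*% mod$beta), response)
# row totals of the count matrix against the response, for comparison
plot(rowSums(mat), response)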
## Not run: 
library(ggplot2)  # provides ggplot(), aes() and stat_summary_bin()
ggplot(data.frame(PC = mod$PCs[, 1], y = response), aes(PC, y)) + stat_summary_bin(bins = 10)

## End(Not run)

andland/genSupPCA documentation built on May 30, 2019, 11:43 a.m.