gpls.formula: Projection to Latent Structures for Generalized Linear Models

Description Usage Arguments Details Value Examples

View source: R/PLS.R

Description

Projection to Latent Structures for Generalized Linear Models

Usage

## S3 method for class 'formula'
gpls(
  formula,
  data,
  ncomp = NULL,
  eps = 0.001,
  maxit = 100,
  denom.eps = 1e-20,
  family = NULL,
  link = NULL,
  firth = FALSE,
  contrasts = NULL,
  ...
)

Arguments

formula

model formula

data

a data frame

ncomp

number of components to retain

eps

convergence tolerance

maxit

maximum number of iterations

denom.eps

tolerance value for denominator to consider a number as zero

family

"gaussian", "poisson", "negative.binomial", "binomial", "multinom", "Gamma", "inverse.gaussian"

link

the link function. See Details for available options.

firth

should Firth's bias correction be applied? Defaults to FALSE.

contrasts

model contrasts

...

further arguments

Details

This function implements what is often called partial least squares for generalized linear models. However, the Swedish statisticians Herman Wold and Svante Wold, who invented the method, maintain that the proper name is projection to latent structures; that name is used here because it would be improper to describe generalized linear models as least squares. As the name implies, PLS works by projecting the predictor matrix onto a lower-dimensional subspace composed of latent factors (in the sense of factor analysis, not categorical variables). This can save an analytic step: instead of asking "what factors underlie my variables?" and then using the factor scores as predictors, PLS directly answers the question "what latent factors explain my outcome variable?" The returned regression coefficients correspond to the original set of explanatory variables, which facilitates inference about those variables when the model is used for purposes other than factor analysis.
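For illustration, the following is a minimal sketch of a fit on a built-in dataset, retaining two latent components; note that accessing the coefficients via fit$coefficients assumes an element name that is not documented here.

## project three predictors onto two latent components
fit <- gpls(mpg ~ wt + hp + disp, data = mtcars, ncomp = 2,
            family = "gaussian", link = "identity")
## coefficients are reported on the scale of the original predictors
fit$coefficients  # assumed element name; inspect the object to confirm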

PLS regression is useful in a variety of circumstances.

Several likelihood functions are implemented here: the gaussian, poisson, binomial, multinomial, gamma, inverse gaussian, and negative binomial distributions. The gaussian distribution is the natural choice for continuous data. The binomial distribution is used for binary outcomes, while the multinomial distribution can model outcomes with multiple categories. The poisson and negative binomial distributions are appropriate for integer count data, with the negative binomial being well suited to overdispersed counts. The gamma and inverse gaussian distributions are appropriate for continuous data with positive support, with the gamma assuming a constant coefficient of variation and the inverse gaussian being suitable for heteroskedastic and/or highly skewed data.
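As a hedged sketch, the family argument can be matched to the outcome type along these lines, using built-in datasets for illustration:

## binary outcome (am is coded 0/1): binomial family
fit_bin <- gpls(am ~ mpg + wt + hp, data = mtcars, family = "binomial")
## integer counts (breaks): poisson family, or "negative.binomial" if overdispersed
fit_cnt <- gpls(breaks ~ wool + tension, data = warpbreaks, family = "poisson")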

The following link functions are available for each distribution:

Gaussian: "identity"
Binomial & Multinomial: "logit", "probit", "cauchit", "robit" (Student T with 3 df), and "cloglog"
Poisson & Negative Binomial: "log"
Gamma: "inverse" (1 / x)
Inverse Gaussian: "1/mu^2" (1/x^2)
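For example, a binomial fit with the complementary log-log link can be requested as follows (an illustrative sketch, not a recommendation for these data):

fit_cll <- gpls(am ~ mpg + wt + hp, data = mtcars, family = "binomial", link = "cloglog")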

Value

a gpls object containing the model fit, factor loadings, linear predictors, fitted values, and an assortment of other things.
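A quick way to see what the returned object holds is to inspect its structure; the element name fitted.values below is an assumption rather than a documented accessor.

fit <- gpls(mpg ~ wt + hp + disp, data = mtcars, family = "gaussian")
str(fit)           # list the components of the returned gpls object
fit$fitted.values  # assumed element name for the fitted values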

Examples

gpls(y ~ ., data)
