Projection to Latent Structures for Generalized Linear Models
Usage
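The usage block did not survive extraction. A plausible signature, reconstructed from the argument list below, is shown here; the default values are assumptions, not taken from the source.

    gpls(formula, data, ncomp = 2, eps = 1e-6, maxit = 100,
         denom.eps = 1e-20, family = "gaussian", link = NULL,
         firth = FALSE, contrasts = NULL, ...)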
Arguments

formula     a model formula
data        a data frame containing the variables in the model
ncomp       the number of latent components to retain
eps         the convergence tolerance for the iterative fitting algorithm
maxit       the maximum number of iterations
denom.eps   the tolerance below which a denominator is treated as zero
family      one of "gaussian", "poisson", "negative.binomial", "binomial",
            "multinom", "Gamma", or "inverse.gaussian"
link        the link function; see Details for the available options
firth       should Firth's bias correction be applied? Defaults to FALSE.
contrasts   an optional list of model contrasts
...         other arguments
Details

This function implements what is often called partial least squares for generalized linear models. However, the Swedish statisticians Herman Wold and Svante Wold, who invented the method, maintained that the proper name is projection to latent structures. That name is used here because "least squares" is a misnomer for generalized linear models, which are not fit by least squares. As the name implies, PLS works by projecting the predictor matrix onto a lower-dimensional subspace of latent factors (factors in the sense of factor analysis, not categorical variables). This can save an analytic step: rather than first asking "what factors underlie my variables?" and then using the factor scores as predictors, PLS directly answers the question "what latent factors explain my outcome variable?" The returned regression coefficients correspond to the original explanatory variables, which facilitates inference about those variables when the method is used for purposes other than factor analysis.
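As a sketch of the step being saved, compare a two-stage factor-scores regression with a single call to gpls on simulated data; the gpls call simply mirrors the argument list above.

    set.seed(1)
    X <- matrix(rnorm(100 * 6), 100, 6)
    y <- rpois(100, exp(0.3 * rowSums(X[, 1:3])))
    dat <- data.frame(y, X)

    # Two stages: extract latent factors first, then regress the outcome on their scores
    scores <- prcomp(X, scale. = TRUE)$x[, 1:2]
    fit_two_stage <- glm(y ~ scores, family = poisson)

    # One stage: the latent factors are chosen to explain y directly
    fit_pls <- gpls(y ~ ., data = dat, ncomp = 2, family = "poisson")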
PLS regression is useful in a variety of circumstances, including:

- multicollinear predictors
- variables believed to be measures of an underlying latent factor (which typically entails multicollinearity)
- regression problems with more predictors than observations (rank deficiency)
- minimizing prediction error in a manner similar to ridge regression
- recovering a set of factors that explain an outcome, a sort of "supervised factor analysis"
Several likelihood functions are implemented here, covering the gaussian, poisson, binomial, gamma, inverse gaussian, and negative binomial distributions. The gaussian distribution is the natural choice for continuous data. The binomial distribution is used for binary outcomes, while the multinomial distribution can model outcomes with multiple categories. The poisson and negative binomial distributions are appropriate for integer count data, with the negative binomial well suited to overdispersed counts. The gamma and inverse gaussian distributions are appropriate for continuous data with positive support; the gamma assumes a constant coefficient of variation, while the inverse gaussian suits heteroskedastic and/or highly skewed data.
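As an illustration of matching family to data, overdispersed counts might be fit with the negative binomial family. The data below are simulated, and the call simply uses the argument names documented above.

    set.seed(3)
    X <- matrix(rnorm(150 * 5), 150, 5)
    mu <- exp(0.4 * X[, 1] - 0.3 * X[, 2])
    y <- rnbinom(150, mu = mu, size = 1.5)   # gamma-mixed Poisson: variance exceeds the mean
    dc <- data.frame(y, X)

    fit_nb <- gpls(y ~ ., data = dc, ncomp = 2,
                   family = "negative.binomial", link = "log")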
The following link functions are available for each distribution:

- Gaussian: "identity"
- Binomial & Multinomial: "logit", "probit", "cauchit", "robit" (Student t with 3 df), and "cloglog"
- Poisson & Negative Binomial: "log"
- Gamma: "inverse" (1/x)
- Inverse Gaussian: "1/mu^2" (1/x^2)
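For example, a binary outcome could be paired with a probit link; "robit" or "cloglog" would be specified the same way. This is an illustrative call on simulated data.

    set.seed(2)
    X <- matrix(rnorm(200 * 4), 200, 4)
    p <- drop(pnorm(X %*% c(1, -1, 0.5, 0)))   # probabilities on the probit scale
    yb <- rbinom(200, 1, p)
    db <- data.frame(yb, X)

    fit_probit <- gpls(yb ~ ., data = db, ncomp = 2,
                       family = "binomial", link = "probit")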
Value

A gpls object containing the model fit, factor loadings, linear predictors, fitted values, and an assortment of other quantities.
Examples
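The original example did not survive extraction. A minimal illustration in its spirit, using simulated data and the argument names documented above, might look like this:

    set.seed(4)
    X <- matrix(rnorm(50 * 10), 50, 10)                  # ten predictors, only five informative
    y <- drop(X %*% rep(c(0.5, 0), each = 5)) + rnorm(50)
    d <- data.frame(y, X)

    fit <- gpls(y ~ ., data = d, ncomp = 3, family = "gaussian", link = "identity")
    str(fit)   # inspect the loadings, linear predictors, and fitted values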