ltm.ecol: Ecological latent trait model
In reo: Random effects ordination

Description Usage Arguments Details Value Author(s) References See Also Examples

Fits the latent trait model for species presence-absence community data described by Walker and Jackson (2011).

ltm.ecol(Y, full = TRUE, cvic = FALSE, g = 10, 
	lambda = c(0.1,1.1,2.1,3.1,4.1), maxit = 200, 
	digits = 1, verbose = TRUE, B0 = NULL)

ltm.ecol.fit(Y, X, q2, Yv = Y, lambda = c(0.1,1.1,2.1,3.1,4.1),
	maxit = 200, digits = 1, verbose = TRUE, B0 = NULL,
	constr = c("none","none","neg","none","neg"), n = nrow(Y), m = ncol(Y))

## S3 method for class 'ltm.ecol'
print(x, digits = 1, which.lambda = "optimum", ...)

## S3 method for class 'ltm.ecol'
screeplot(x, which.lambdas, lambda.cex = 1, lambda.las = 2,
	reg.cex = 0.7, print.reg = TRUE, xlab, ylab, ...)

## S3 method for class 'ltm.ecol'
predict(object, newdata, responses0 = numeric(0),
	responses1 = c(1), predictors = (1:object$m)[-c(responses0,responses1)],
	which.lambda = "optimum", ...)

## S3 method for class 'ltm.ecol'
scores(x, newdata, which.lambda = "optimum", which.spp = 1:x$m, ...)

## S3 method for class 'ltm.ecol'
boxplot(x, newdata = NULL, responses0 = numeric(0),
	 responses1 = 1, predictors = (1:x$m)[-c(responses0,responses1)],
	which.lambda = "optimum", ...)

## S3 method for class 'ltm.ecol'
biplot(x, newdata, spp = 1:object$m,
	arrows = "none", onespp = FALSE, which.lambda = "optimum",
	g = 75, cex.sites = 1, cex.newsites = 1, ...)

`Y`	Sites by species data frame (or matrix).
`full`	If TRUE, each lambda is used to fit to the full data set.
`cvic`	If TRUE, each lambda is used to fit to each data set with one observation missing in order to calculate the cross-validation information criterion (CVIC).
`g`	Number of lattice points along each axis. For the biplot method, this should be set higher to get better resolution (usually the default of 75 is adequate). In contrast, for `ltm.ecol`, we have found that in practice it is unnecessary to set `g` much higher than 10.
`lambda`	Vector of regularization parameters. It is STRONGLY recommended that these be given in increasing order.
`maxit`	Maximum number of EM iterations per model.
`digits`	Number of digits to retain in the rounding of the log-likelihood when determining convergence. For `print.ltm.ecol`, `digits` is the number of digits with which to round estimated model parameters.
`verbose`	If TRUE, information about the fitting procedure is displayed.
`B0`	Optional initial value for the coefficient matrix. If NULL, the default, then correspondence analysis and least-squares are used to (determinstically) generate a starting value.
`X`	The matrix representing the ordination space (i.e. the lattice). Most users should not need to consider this argument because `ltm.ecol.fit` is not meant to be called directly.
`q2`	The marginal probability for each point in the ordination space. Most users should not need to consider this argument because `ltm.ecol.fit` is not meant to be called directly.
`Yv`	Validation data. Most users should not need to consider this argument because `ltm.ecol.fit` is not meant to be called directly.
`constr`	Which constraints to pass to LASSO thresholding. Most users should not need to consider this argument because `ltm.ecol.fit` is not meant to be called directly.
`n`	Number of sites. Most users should not need to consider this argument because `ltm.ecol.fit` is not meant to be called directly.
`m`	Number of species. Most users should not need to consider this argument because `ltm.ecol.fit` is not meant to be called directly.
`x`	fitted `ltm.ecol` object.
`which.lambdas`	Optional vector giving subscripts indexing models to be used. For example, `which.lambdas=c(1,2,4)` indicates that the first two and fourth models are to be used but not the third model. Each model corresponds to a different value of the regularization parameter, `lambda`, and are numbered in the order in which they are supplied to `ltm.ecol`.
`lambda.cex`	Graphics parameter for the lambdas.
`lambda.las`	Graphics parameter for the lambdas.
`reg.cex`	Graphics parameter for the lambda title.
`print.reg`	If TRUE, a second horizontal axis is plotted giving the lambda values.
`xlab`	x-axis label for screeplots. A standard label is plotted if missing.
`ylab`	y-axis label for screeplots. A standard label is plotted if missing.
`object`	fitted `ltm.ecol` object.
`newdata`	A data frame containing validation data and the same species (i.e. variables) and species names (i.e. `colnames`).
`responses0`	Vector of indices of species to be predicted as absent.
`responses1`	Vector of indices of species to be predicted as present.
`predictors`	Vector of indices of species to be used as predictors. If `predictors` is `NULL`, then the marginal distribution is used.
`which.lambda`	If "optimum", then the model with lowest CIVC is used for predictions, otherwise an index specifying the model to be used.
`which.spp`	Vector of indices of species to be used as conditioning variables in calculating axis scores.
`spp`	Vector of indices of species to be represented in the biplot.
`arrows`	If "input" the user can use the mouse to draw arrows and these arrow positions will be saved in the object returned by `biplot.ltm.ecol`; if "none" no arrows are plotted; if a list returned by a call to `biplot.ltm.ecol` with `arrows = "input"` then the loaded arrows will be drawn.
`onespp`	If FALSE then the species in `spp` are represented by their 0.5 probability contour line; if TRUE then only one species will be plotted with a variety of probability contour lines, giving more detail about that single species.
`cex.sites`	Size of the site points.
`cex.newsites`	Size of the new site points.
`...`	Additional arguments.

This function uses an EM-algorithm with LASSO (i.e. L1) regularization described by Walker and Jackson (2011). Our algorithm is based on the ideas of Bock and Aitkin (1981); Woodruff and Hanson (1996); Rizopoulos (2006); Houseman et al. (2007); and Friedman et al. (2010).

There is no guarantee that this algorithm will converge to a global optimum, but it is guaranteed to improve the initial value, B0, which by default is the correspondence analysis approximation (due to ter Braak 1985) of the maximum likelihood estimates for a fixed-effects version of this model.

Run times can be very long for large data sets, especially when computing CVIC (i.e. cvic=TRUE). If computing time is deemed too long, we offer three possible courses of action: (1) reduce digits (resulting in convergence further from the local / global optimum), (2) reduce the number of lambda values, (3) simply use lambda=1 (which is a value that we have found works well for a wide variety of ecological data).

ltm.ecol.fit is not meant to be called directly.

In biplot, observational units used to fit the model are plotted as circles and those provided as newdata are plotted as squares.

An object of class ltm.ecol with components,

`B`	A list with each component giving the fitted coefficient matrix for each value of lambda, for the full data.
`L`	A vector with each element giving the log-likelihoods at convergence for each value of lambda, for the full data.
`lambda`	The supplied lambda values.
`deviance`	A vector with each element giving the deviances at convergence for each value of lambda, for the full data.
`complexity`	A vector with each element giving the number of non-zero parameters per species at convergence for each value of lambda, for the full data.
`instab`	A vector with each element corresponding to a particular lambda value. If an element = 1, it indicates that there was a numerical instability in estimation corresponding to that particular lambda and that the initial values of coefficients were returned rather than the coefficients at the time the algorithm was terminated. If an element = 0 it indicates no detected numerical instability.
`cvic`	A list with components described below.
`opt.B`	The fitted coefficient matrix for the optimal value of lambda.
`Y`	The supplied data matrix.
`n`	The number of sites.
`m`	The number of species.
`X`	The matrix representing the ordination space (lattice).
`x`	The points at which each axis takes values.
`q`	The marginal probability for each axis.
`q2`	The marginal probability for each point in the ordination space.

The ltm.ecol$cvic list has components,

`cvic`	A vector with each element giving the cvic at convergence for each value of lambda.
`lambda`	The supplied lambda values.
`instab`	See `instab` above.
`logliks`	A matrix giving the cross-validated log likelihoods for each cross-validation iteration (rows) and each value of lambda (columns).
`Bjack`	A list of lists containing the coefficient matrices for each iteration of leave-one-out cross-validation (inner list) and each value of lambda (outer list). These lists could be useful for computing jackknife estimates of uncertainty, although this is not done here.
`opt.lambda`	The lambda associated with the lowest cvic.

Steve C. Walker

Bock, R.D. and Aitkin, M. (1981) Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika. 46: 443-459.

Friedman, J.H., Hastie, T. and Tibshirani, R. (2010) Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33: 1-22.

Houseman, E.A. et al. (2007) Penalized item response theory models: Application to epigenetic alterations in bladder cancer. Biometrics. 63: 1269-1277.

Rizopoulos, D. (2006) ltm: An R package for latent variable modeling and item response theory analyses. Journal of Statistical Software 17: 1-25.

ter Braak, C.J.O (1985) Correspondence analysis of incidence and abundance data: properties in terms of a unimodal response model. Biometrics. 41: 859-873.

Walker, S.C. and Jackson, D.A. (2011) Random-effects ordination: describing and predicting multivariate correlations and co-occurrences. Ecological Monographs. 81: 635-663.

Woodruff, D.J. and Hanson, B.A. (1996) Estimation of item response models using the EM algorithm for finite mixtures. Technical report, ACT.

For another implementation of similar latent trait models in R, see ltm function in the ltm package. For random-effects ordination of multivariate-normal data see factanal.predictive.

# fit to Black Hollow fish data
fish.ltm <- ltm.ecol(fish,lambda=1)
print(fish.ltm,which.lambda=1)

# make a biplot with point size of lakes proportional to latitude and 
# contour lines for spp = 9.
spp <- 9
lat <- as.matrix(lat)
cex.sites <- 0.1+((lat - min(lat))/(max(lat)-min(lat)))
biplot(fish.ltm,onespp=TRUE,spp=spp,which.lambda=1,
	cex.sites=cex.sites,cex.newsites=0)

# make a boxplot of fitted probabilities of occurrence for spp = 9.
boxplot(fish.ltm,responses1=spp,
	xlab=paste("Fitted probability of occurrence of species",
	colnames(fish)[spp]),which.lambda=1)

# fit to BCI tree data
data(BCI)
BCI <- (BCI>0)*1
BCI.ltm <- ltm.ecol(BCI,lambda=1)

# plot estimated axis scores on a map
xhat <- scores(BCI.ltm,which.lambda=1)
UTM.EW <- rep(seq(625754, 626654, by=100), each=5)
UTM.NS <- rep(seq(1011569,  1011969, by=100), len=50)
ax <- 1
plot(UTM.EW,UTM.NS,type="n")
points(UTM.EW[xhat[,ax]>0],UTM.NS[xhat[,ax]>0],
	pch=16,cex=xhat[xhat[,ax]>0,ax],col="red")
points(UTM.EW[xhat[,ax]<0],UTM.NS[xhat[,ax]<0],
	pch=16,cex=-xhat[xhat[,ax]<0,ax],col="blue")
title(main=paste("Axis",ax,"in space (red=pos, blue=neg, size=magnitude)"))