ltm.ecol: Ecological latent trait model

Description Usage Arguments Details Value Author(s) References See Also Examples

Description

Fits the latent trait model for species presence-absence community data described by Walker and Jackson (2011).

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
ltm.ecol(Y, full = TRUE, cvic = FALSE, g = 10, 
	lambda = c(0.1,1.1,2.1,3.1,4.1), maxit = 200, 
	digits = 1, verbose = TRUE, B0 = NULL)

ltm.ecol.fit(Y, X, q2, Yv = Y, lambda = c(0.1,1.1,2.1,3.1,4.1),
	maxit = 200, digits = 1, verbose = TRUE, B0 = NULL,
	constr = c("none","none","neg","none","neg"), n = nrow(Y), m = ncol(Y))

## S3 method for class 'ltm.ecol'
print(x, digits = 1, which.lambda = "optimum", ...)

## S3 method for class 'ltm.ecol'
screeplot(x, which.lambdas, lambda.cex = 1, lambda.las = 2,
	reg.cex = 0.7, print.reg = TRUE, xlab, ylab, ...)

## S3 method for class 'ltm.ecol'
predict(object, newdata, responses0 = numeric(0),
	responses1 = c(1), predictors = (1:object$m)[-c(responses0,responses1)],
	which.lambda = "optimum", ...)

## S3 method for class 'ltm.ecol'
scores(x, newdata, which.lambda = "optimum", which.spp = 1:x$m, ...)

## S3 method for class 'ltm.ecol'
boxplot(x, newdata = NULL, responses0 = numeric(0),
	 responses1 = 1, predictors = (1:x$m)[-c(responses0,responses1)],
	which.lambda = "optimum", ...)

## S3 method for class 'ltm.ecol'
biplot(x, newdata, spp = 1:object$m,
	arrows = "none", onespp = FALSE, which.lambda = "optimum",
	g = 75, cex.sites = 1, cex.newsites = 1, ...)

Arguments

Y

Sites by species data frame (or matrix).

full

If TRUE, each lambda is used to fit to the full data set.

cvic

If TRUE, each lambda is used to fit to each data set with one observation missing in order to calculate the cross-validation information criterion (CVIC).

g

Number of lattice points along each axis. For the biplot method, this should be set higher to get better resolution (usually the default of 75 is adequate). In contrast, for ltm.ecol, we have found that in practice it is unnecessary to set g much higher than 10.

lambda

Vector of regularization parameters. It is STRONGLY recommended that these be given in increasing order.

maxit

Maximum number of EM iterations per model.

digits

Number of digits to retain in the rounding of the log-likelihood when determining convergence. For print.ltm.ecol, digits is the number of digits with which to round estimated model parameters.

verbose

If TRUE, information about the fitting procedure is displayed.

B0

Optional initial value for the coefficient matrix. If NULL, the default, then correspondence analysis and least-squares are used to (determinstically) generate a starting value.

X

The matrix representing the ordination space (i.e. the lattice). Most users should not need to consider this argument because ltm.ecol.fit is not meant to be called directly.

q2

The marginal probability for each point in the ordination space. Most users should not need to consider this argument because ltm.ecol.fit is not meant to be called directly.

Yv

Validation data. Most users should not need to consider this argument because ltm.ecol.fit is not meant to be called directly.

constr

Which constraints to pass to LASSO thresholding. Most users should not need to consider this argument because ltm.ecol.fit is not meant to be called directly.

n

Number of sites. Most users should not need to consider this argument because ltm.ecol.fit is not meant to be called directly.

m

Number of species. Most users should not need to consider this argument because ltm.ecol.fit is not meant to be called directly.

x

fitted ltm.ecol object.

which.lambdas

Optional vector giving subscripts indexing models to be used. For example, which.lambdas=c(1,2,4) indicates that the first two and fourth models are to be used but not the third model. Each model corresponds to a different value of the regularization parameter, lambda, and are numbered in the order in which they are supplied to ltm.ecol.

lambda.cex

Graphics parameter for the lambdas.

lambda.las

Graphics parameter for the lambdas.

reg.cex

Graphics parameter for the lambda title.

print.reg

If TRUE, a second horizontal axis is plotted giving the lambda values.

xlab

x-axis label for screeplots. A standard label is plotted if missing.

ylab

y-axis label for screeplots. A standard label is plotted if missing.

object

fitted ltm.ecol object.

newdata

A data frame containing validation data and the same species (i.e. variables) and species names (i.e. colnames).

responses0

Vector of indices of species to be predicted as absent.

responses1

Vector of indices of species to be predicted as present.

predictors

Vector of indices of species to be used as predictors. If predictors is NULL, then the marginal distribution is used.

which.lambda

If "optimum", then the model with lowest CIVC is used for predictions, otherwise an index specifying the model to be used.

which.spp

Vector of indices of species to be used as conditioning variables in calculating axis scores.

spp

Vector of indices of species to be represented in the biplot.

arrows

If "input" the user can use the mouse to draw arrows and these arrow positions will be saved in the object returned by biplot.ltm.ecol; if "none" no arrows are plotted; if a list returned by a call to biplot.ltm.ecol with arrows = "input" then the loaded arrows will be drawn.

onespp

If FALSE then the species in spp are represented by their 0.5 probability contour line; if TRUE then only one species will be plotted with a variety of probability contour lines, giving more detail about that single species.

cex.sites

Size of the site points.

cex.newsites

Size of the new site points.

...

Additional arguments.

Details

This function uses an EM-algorithm with LASSO (i.e. L1) regularization described by Walker and Jackson (2011). Our algorithm is based on the ideas of Bock and Aitkin (1981); Woodruff and Hanson (1996); Rizopoulos (2006); Houseman et al. (2007); and Friedman et al. (2010).

There is no guarantee that this algorithm will converge to a global optimum, but it is guaranteed to improve the initial value, B0, which by default is the correspondence analysis approximation (due to ter Braak 1985) of the maximum likelihood estimates for a fixed-effects version of this model.

Run times can be very long for large data sets, especially when computing CVIC (i.e. cvic=TRUE). If computing time is deemed too long, we offer three possible courses of action: (1) reduce digits (resulting in convergence further from the local / global optimum), (2) reduce the number of lambda values, (3) simply use lambda=1 (which is a value that we have found works well for a wide variety of ecological data).

ltm.ecol.fit is not meant to be called directly.

In biplot, observational units used to fit the model are plotted as circles and those provided as newdata are plotted as squares.

Value

An object of class ltm.ecol with components,

B

A list with each component giving the fitted coefficient matrix for each value of lambda, for the full data.

L

A vector with each element giving the log-likelihoods at convergence for each value of lambda, for the full data.

lambda

The supplied lambda values.

deviance

A vector with each element giving the deviances at convergence for each value of lambda, for the full data.

complexity

A vector with each element giving the number of non-zero parameters per species at convergence for each value of lambda, for the full data.

instab

A vector with each element corresponding to a particular lambda value. If an element = 1, it indicates that there was a numerical instability in estimation corresponding to that particular lambda and that the initial values of coefficients were returned rather than the coefficients at the time the algorithm was terminated. If an element = 0 it indicates no detected numerical instability.

cvic

A list with components described below.

opt.B

The fitted coefficient matrix for the optimal value of lambda.

Y

The supplied data matrix.

n

The number of sites.

m

The number of species.

X

The matrix representing the ordination space (lattice).

x

The points at which each axis takes values.

q

The marginal probability for each axis.

q2

The marginal probability for each point in the ordination space.

The ltm.ecol$cvic list has components,

cvic

A vector with each element giving the cvic at convergence for each value of lambda.

lambda

The supplied lambda values.

instab

See instab above.

logliks

A matrix giving the cross-validated log likelihoods for each cross-validation iteration (rows) and each value of lambda (columns).

Bjack

A list of lists containing the coefficient matrices for each iteration of leave-one-out cross-validation (inner list) and each value of lambda (outer list). These lists could be useful for computing jackknife estimates of uncertainty, although this is not done here.

opt.lambda

The lambda associated with the lowest cvic.

Author(s)

Steve C. Walker

References

Bock, R.D. and Aitkin, M. (1981) Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika. 46: 443-459.

Friedman, J.H., Hastie, T. and Tibshirani, R. (2010) Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33: 1-22.

Houseman, E.A. et al. (2007) Penalized item response theory models: Application to epigenetic alterations in bladder cancer. Biometrics. 63: 1269-1277.

Rizopoulos, D. (2006) ltm: An R package for latent variable modeling and item response theory analyses. Journal of Statistical Software 17: 1-25.

ter Braak, C.J.O (1985) Correspondence analysis of incidence and abundance data: properties in terms of a unimodal response model. Biometrics. 41: 859-873.

Walker, S.C. and Jackson, D.A. (2011) Random-effects ordination: describing and predicting multivariate correlations and co-occurrences. Ecological Monographs. 81: 635-663.

Woodruff, D.J. and Hanson, B.A. (1996) Estimation of item response models using the EM algorithm for finite mixtures. Technical report, ACT.

See Also

For another implementation of similar latent trait models in R, see ltm function in the ltm package. For random-effects ordination of multivariate-normal data see factanal.predictive.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
# fit to Black Hollow fish data
fish.ltm <- ltm.ecol(fish,lambda=1)
print(fish.ltm,which.lambda=1)

# make a biplot with point size of lakes proportional to latitude and 
# contour lines for spp = 9.
spp <- 9
lat <- as.matrix(lat)
cex.sites <- 0.1+((lat - min(lat))/(max(lat)-min(lat)))
biplot(fish.ltm,onespp=TRUE,spp=spp,which.lambda=1,
	cex.sites=cex.sites,cex.newsites=0)

# make a boxplot of fitted probabilities of occurrence for spp = 9.
boxplot(fish.ltm,responses1=spp,
	xlab=paste("Fitted probability of occurrence of species",
	colnames(fish)[spp]),which.lambda=1)

# fit to BCI tree data
data(BCI)
BCI <- (BCI>0)*1
BCI.ltm <- ltm.ecol(BCI,lambda=1)

# plot estimated axis scores on a map
xhat <- scores(BCI.ltm,which.lambda=1)
UTM.EW <- rep(seq(625754, 626654, by=100), each=5)
UTM.NS <- rep(seq(1011569,  1011969, by=100), len=50)
ax <- 1
plot(UTM.EW,UTM.NS,type="n")
points(UTM.EW[xhat[,ax]>0],UTM.NS[xhat[,ax]>0],
	pch=16,cex=xhat[xhat[,ax]>0,ax],col="red")
points(UTM.EW[xhat[,ax]<0],UTM.NS[xhat[,ax]<0],
	pch=16,cex=-xhat[xhat[,ax]<0,ax],col="blue")
title(main=paste("Axis",ax,"in space (red=pos, blue=neg, size=magnitude)"))

reo documentation built on May 2, 2019, 4:55 p.m.

Related to ltm.ecol in reo...