Fits a principal curve to a numeric multivariate dataset in arbitrary dimensions. Produces diagnostic plots.
pcurve(x, xcan = NULL, start = "ca", rank = FALSE, cv.fit = FALSE,
       penalty = 1, cv.all = FALSE, df = "vary", fit.meth = "spline",
       canfit = "lm", candf = FALSE, vary.adj = FALSE, subset,
       robust = FALSE, lowf = 0.5, min.df, max.df, max.df.cv.fit,
       ext.dist = TRUE, ext.dc = 0.9, metric = "bray", latent = FALSE,
       plot.pca = TRUE, thresh = 0.001, plot.true = TRUE,
       plot.init = FALSE, plot.segs = TRUE, plot.resp = TRUE,
       plot.cov = TRUE, maxit = 10, stretch = 2, fits = FALSE,
       prnt.fits = TRUE, trace = TRUE, trace.all = FALSE, pch = 1,
       row.chk0 = FALSE, col.chk0 = TRUE, use.loc = FALSE)
x |
numeric data matrix or data.frame. |
xcan |
data.frame or matrix of explanatory variables to be used in constrained PCs. |
start |
specifies how to determine the starting configuration (location of points on the initial curve): "ca" = correspondence analysis; "pca" = principal components analysis with Euclidean metric; "pca.bc" = principal components analysis with Bray-Curtis metric; "mds" = non-metric multidimensional scaling with Euclidean metric; "mds.bc" = non-metric multidimensional scaling with Bray-Curtis metric; "cs.bc" = classical scaling (metric multidimensional scaling) with Bray-Curtis metric; "ran" = random start. Alternatively, if start is numeric and of length dim(x)[1], it is used as a user-supplied configuration. |
rank |
if TRUE the starting configuration is transformed to ranks. |
cv.fit |
if TRUE a final iteration using cross-validation is done. |
penalty |
penalty for smoothing spline. A value of 1 corresponds to no penalty, with values > 1 giving a smoother fit. Increasing the penalty for small data sets can reduce over-fitting. If penalty = "np", penalty = 1 for N > 1000, penalty = 2 for N <= 100, and penalty = 4 - log(N, 10) for 100 < N <= 1000. |
cv.all |
if TRUE a cross-validated smoothing spline fit is done at each iteration. |
df |
if numeric specifies the df for the smoothing spline. |
fit.meth |
specifies the smoother: "spline" = smooth.spline, "poisson" = Poisson generalized additive model, "binomial" = binomial generalized additive model, "lowess" = lowess smoother (this argument is overridden by robust = TRUE). |
canfit |
"lm" or "gam", model used to relate pc to xcan. |
candf |
if canfit = "gam", df for model. May be a single value or
a vector of FALSE or positive integers indicating dfs for each
explanatory variable in xcan. If FALSE, this is equivalent to
fx=FALSE in |
vary.adj |
if FALSE the same df are used for the smooth of each variable, otherwise each variable has its own df. |
subset |
used to take a subset of x and start (if numeric). |
robust |
if TRUE uses lowess smooths, if FALSE uses smoothing spline. |
lowf |
specifies the span of the lowess smooth. |
min.df |
specifies the min df for the smoothing. |
max.df |
specifies the max df for the smoothing. |
max.df.cv.fit |
specifies the max df for smoothing during cross-validation. |
ext.dist |
if TRUE extended dissimilarities are used in the calculation of the
initial configuration using the flexible shortest path. If FALSE
standard dissimilarities are used (see De'ath, 1999b and
|
ext.dc |
critical distance, the toolong argument in |
metric |
similarity metric, the method argument in |
latent |
if FALSE locations are rescaled after each iteration to give distance along the curve; if TRUE no rescaling is done. |
plot.pca |
if TRUE the fitting is plotted (assuming plot.true = TRUE) in the first 2 dimensions of PCA space. |
thresh |
threshold value of difference in cross-validation for ceasing iteration |
plot.true |
if TRUE the fitting process is plotted. |
plot.init |
if TRUE the initial fits to each variable are plotted. |
plot.segs |
if TRUE segments linking the fitted points on the curves to their corresponding data points are plotted. |
plot.resp |
if TRUE the final response curves are plotted. |
plot.cov |
if TRUE covariate partial effects are plotted (only if xcan is not null). |
maxit |
specifies the maximum number of iterations. |
stretch |
end segments of the curve are stretched by this factor at each iteration. |
fits |
if TRUE value of pcurve includes diagnostics for each variable. |
prnt.fits |
if TRUE statistics on model fits are printed. |
trace |
if TRUE prints out useful fitting diagnostics at each iteration. |
trace.all |
if TRUE prints out all curve details at each iteration. |
pch |
symbol for plots |
row.chk0 |
if TRUE checks for and removes rows of x identically 0. |
col.chk0 |
if TRUE checks for and removes columns of x identically 0. |
use.loc |
if TRUE pauses during the fitting displays (left mouse-click to progress to next plot). |
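The penalty = "np" rule described for the penalty argument can be written out as a small function. This is only a sketch of the rule as stated above: np_penalty is a hypothetical helper name and is not part of the pcurve package.

```r
# Hypothetical helper (not part of pcurve): the smoothing-spline penalty
# implied by penalty = "np" for a data set with n rows.
np_penalty <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 1)
  if (n <= 100) {
    2
  } else if (n <= 1000) {
    4 - log(n, 10)  # declines from 2 at n = 100 to 1 at n = 1000
  } else {
    1
  }
}

# Penalty for a few illustrative sample sizes
sapply(c(50, 100, 316, 1000, 5000), np_penalty)
```

The rule is continuous at both break points, so the penalty shrinks gradually as the sample size grows.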
See De'ath (1999a) for a full discussion of the functions and their application.
An object of class principal curve containing a list comprising
s |
fitted values |
tag |
order of points along the curve |
lambda |
locations along the curve |
dist |
sum of squared distances of points from the curve |
c |
call to pcurve |
x |
data to which the curve was fitted |
df |
degrees of freedom for the smoothers used in the fit |
fit.list |
diagnostics for each variable, only included if fits = TRUE. |
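The components lambda and tag are related: tag gives the order of the points along the curve, and lambda gives their locations. The sketch below uses a mock list with made-up values in place of a real fit, and assumes that tag is the permutation that sorts lambda (as in Hastie's principal curve code); check this against an actual pcurve result before relying on it.

```r
# Mock object with made-up values; a real one would be returned by pcurve().
fit <- list(
  s      = matrix(c(1.0, 0.2, 2.1, 0.6,
                    0.9, 0.1, 1.8, 0.4), ncol = 2),  # fitted values, 4 points
  lambda = c(2.4, 0.0, 5.1, 1.3),                    # locations along curve
  tag    = c(2, 4, 1, 3)                             # order along curve
)

# Assumption: tag is the ordering of lambda
all(order(fit$lambda) == fit$tag)

# Fitted values arranged from one end of the curve to the other
curve_path <- fit$s[fit$tag, ]
curve_path
```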
R port by Chris Walsh cwalsh@unimelb.edu.au from S+ library by Glenn De'ath g.death@aims.gov.au. Original S code for principal curve analysis by Trevor Hastie hastie@stat.stanford.edu.
De'ath, G. 1999a Principal Curves: a new technique for indirect and direct gradient analysis. Ecology 80, 2237–2253.
De'ath, G. 1999b Extended dissimilarity: method of robust estimation of ecological distances with high beta diversity. Plant Ecology 144, 191–199.
Gittins, R. 1985 Canonical Analysis. A review with applications in ecology. Berlin: Springer-Verlag.
Hastie, T.J and Tibshirani, R.J. 1990 Generalized additive models. London: Chapman and Hall.
Hastie, T.J. and Stuetzle, W. 1989 Principal Curves. Journal of the American Statistical Association 84, 502–516.
pcdiags.plt, vegdist, stepacross
# a simulated dataset with 4 response variables (taxa 1-4),
# n = 100. The response curve is Gaussian and noise is Poisson.
data(sim4var)
sim4fit <- pcurve(sim4var, plot.init = FALSE, use.loc = TRUE)

# Limestone grassland community example worked by De'ath (1999a),
# from data in Gittins (1985)
data(soilspec)
species <- sqrt(soilspec[, 2:9])
envvar <- soilspec[, 10:12]

# indirect gradient analysis
spec.fit <- pcurve(species, start = "mds.bc", plot.init = FALSE,
                   use.loc = TRUE)

# direct gradient analysis
soilspec.fit <- pcurve(species, xcan = envvar,
                       start = "mds.bc", plot.init = FALSE,
                       fits = TRUE, prnt.fits = TRUE,
                       use.loc = TRUE)
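Because start also accepts a numeric vector of length dim(x)[1], a starting configuration can be computed by hand. The following base-R sketch approximates the idea behind start = "cs.bc" (classical scaling on Bray-Curtis dissimilarities) using simulated counts. It is not the package's internal code: bray_curtis is a hypothetical helper, the data are made up, and it does not apply extended dissimilarities.

```r
# Hypothetical helper (not part of pcurve): pairwise Bray-Curtis
# dissimilarity between the rows of a non-negative matrix.
bray_curtis <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      d[i, j] <- d[j, i] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)
}

# Made-up data: 4 taxa with Gaussian responses along a gradient, Poisson noise
set.seed(1)
grad <- seq(0, 10, length.out = 30)
mu <- sapply(c(2, 4, 6, 8), function(opt) 20 * exp(-(grad - opt)^2 / 8))
x <- matrix(rpois(length(mu), mu), nrow = nrow(mu))
x <- x[rowSums(x) > 0, , drop = FALSE]  # drop all-zero rows, cf. row.chk0

# One-dimensional classical scaling gives a location for each row of x
start <- cmdscale(bray_curtis(x), k = 1)[, 1]
length(start) == nrow(x)

# start could then be supplied as pcurve(x, start = start)
```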