IKFA | R Documentation |
Functions for reconstructing (predicting) environmental values from biological assemblages using Imbrie & Kipp Factor Analysis (IKFA), as used in palaeoceanography.
IKFA(y, x, nFact = 5, IsPoly = FALSE, IsRot = TRUE,
ccoef = 1:nFact, check.data=TRUE, lean=FALSE, ...)
IKFA.fit(y, x, nFact = 5, IsPoly = FALSE, IsRot = TRUE,
ccoef = 1:nFact, lean=FALSE)
## S3 method for class 'IKFA'
predict(object, newdata=NULL, sse=FALSE, nboot=100,
match.data=TRUE, verbose=TRUE, ...)
communality(object, y)
## S3 method for class 'IKFA'
crossval(object, cv.method="loo", verbose=TRUE, ngroups=10,
nboot=100, h.cutoff=0, h.dist=NULL, ...)
## S3 method for class 'IKFA'
performance(object, ...)
## S3 method for class 'IKFA'
rand.t.test(object, n.perm=999, ...)
## S3 method for class 'IKFA'
screeplot(x, rand.test=TRUE, ...)
## S3 method for class 'IKFA'
print(x, ...)
## S3 method for class 'IKFA'
summary(object, full=FALSE, ...)
## S3 method for class 'IKFA'
plot(x, resid=FALSE, xval=FALSE, nFact=max(x$ccoef),
xlab="", ylab="", ylim=NULL, xlim=NULL, add.ref=TRUE,
add.smooth=FALSE, ...)
## S3 method for class 'IKFA'
residuals(object, cv=FALSE, ...)
## S3 method for class 'IKFA'
coef(object, ...)
## S3 method for class 'IKFA'
fitted(object, ...)
y |
a data frame or matrix of biological abundance data. |
x , object |
a vector of environmental values to be modelled or an object of class |
newdata |
new biological data to be predicted. |
nFact |
number of factors to extract. |
IsRot |
logical to rotate factors. |
ccoef |
vector of factor numbers to include in the predictions. |
IsPoly |
logical to include quadratic of the factors as predictors in the regression. |
check.data |
logical to perform simple checks on the input data. |
match.data |
logical indicating whether the function will match two species datasets by their column names. You should only set this to |
lean |
logical to exclude some output from the resulting models (used when cross-validating to speed calculations). |
full |
logical to show head and tail of output in summaries. |
resid |
logical to plot residuals instead of fitted values. |
xval |
logical to plot cross-validation estimates. |
xlab , ylab , xlim , ylim |
additional graphical arguments to |
add.ref |
add 1:1 line on plot. |
add.smooth |
add loess smooth to plot. |
cv.method |
cross-validation method, either "loo", "lgo", "bootstrap" or "h-block". |
verbose |
logical to show feedback during cross-validation. |
nboot |
number of bootstrap samples. |
ngroups |
number of groups in leave-group-out cross-validation, or a vector containing leave-out group membership. |
h.cutoff |
cutoff for h-block cross-validation. Only training samples greater than |
h.dist |
distance matrix for use in h-block cross-validation. Usually a matrix of geographical distances between samples. |
sse |
logical indicating that sample specific errors should be calculated. |
rand.test |
logical to perform a randomisation t-test to test significance of cross validated factors. |
n.perm |
number of permutations for randomisation t-test. |
cv |
logical to indicate model or cross-validation residuals. |
... |
additional arguments. |
Function IKFA performs Imbrie and Kipp Factor Analysis, a form of Principal Components Regression (Imbrie & Kipp 1971).
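As a rough illustration of the principal components regression idea, the following base R sketch uses simulated data. It is not the package's implementation: the row normalisation, the use of svd and varimax, the communality calculation and all object names (y.norm, load, load.rot, comm) are assumptions made for this example.

```r
# Minimal sketch of principal components regression on simulated data.
# This is NOT the IKFA code; every object name here is hypothetical.
set.seed(1)
n <- 40                                   # number of samples
m <- 8                                    # number of taxa
y <- matrix(rexp(n * m), n, m)            # simulated abundances
x <- 2 * y[, 1] - y[, 2] + rnorm(n, sd = 0.1)  # simulated environmental variable

# Assumption: scale each sample (row) to unit length before extracting factors
y.norm <- y / sqrt(rowSums(y^2))

# Extract nFact factors with a singular value decomposition
nFact <- 3
sv <- svd(y.norm)
load <- y.norm %*% sv$v[, 1:nFact, drop = FALSE]  # sample loadings on the factors

# Optional varimax rotation of the retained factors (compare IsRot = TRUE)
rot <- varimax(load)
load.rot <- load %*% rot$rotmat

# Classical communality: sum of squared loadings per sample (assumed definition)
comm <- rowSums(load^2)

# Regress the environmental variable on the factor loadings
mod <- lm(x ~ load.rot)
fit <- fitted(mod)
rmse <- sqrt(mean((x - fit)^2))
```

Because an orthogonal rotation of all retained factors spans the same space, the fitted values in this sketch are the same with or without rotation; rotation matters once only a subset of factors is used as predictors.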
Function predict predicts values of the environmental variable for newdata, or returns the fitted (predicted) values from the original modern dataset if newdata is NULL. Variables are matched between the training data and newdata by column name (if match.data is TRUE). Use compare.datasets to assess the conformity of two species datasets and identify possible no-analogue samples.
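The idea of matching taxa by column name can be sketched in base R as follows. This is an illustration only: the object names are hypothetical, and filling taxa that are absent from the new data with zeros is an assumption of the sketch, not a statement about what the package does internally.

```r
# Minimal sketch: align the columns of a new dataset to a training set by name.
# Object names are hypothetical; zero-filling absent taxa is an assumption.
train <- matrix(1:6, 2, 3, dimnames = list(NULL, c("sp1", "sp2", "sp3")))
newd  <- matrix(c(10, 20, 30, 40), 2, 2, dimnames = list(NULL, c("sp3", "sp4")))

# Start from an all-zero matrix with the training set's columns
matched <- matrix(0, nrow(newd), ncol(train),
                  dimnames = list(rownames(newd), colnames(train)))

# Copy across the taxa that occur in both datasets
common <- intersect(colnames(train), colnames(newd))
matched[, common] <- newd[, common]

# Taxa present only in the new data (here "sp4") cannot contribute to predictions
dropped <- setdiff(colnames(newd), colnames(train))
```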
IKFA has methods fitted and residuals that return the fitted values (estimates) and residuals for the training set; performance, which returns summary performance statistics (see below); coef, which returns the species coefficients; and print and summary to summarise the output. IKFA also has a plot method that produces scatter plots of predicted vs observed measurements for the training set.
Function rand.t.test performs a randomisation t-test to test the significance of the cross-validated components after van der Voet (1994).
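The general idea of a randomisation test on paired prediction errors can be sketched in base R as below. The function name rand_t_sketch, its arguments and the one-sided sign-flipping scheme are assumptions made for illustration; this is not the code used by rand.t.test.

```r
# Minimal sketch of a one-sided randomisation test comparing two models.
# e1: cross-validation residuals of the simpler model
# e2: cross-validation residuals of the more complex model
# Hypothetical helper, not part of any package.
rand_t_sketch <- function(e1, e2, n.perm = 999) {
  d <- e1^2 - e2^2                 # paired differences in squared error
  obs <- mean(d)                   # observed mean improvement
  perm <- replicate(n.perm, {
    signs <- sample(c(-1, 1), length(d), replace = TRUE)
    mean(d * signs)                # mean difference after random sign flips
  })
  # Proportion of permutations at least as large as the observed value
  (sum(perm >= obs) + 1) / (n.perm + 1)
}

set.seed(123)
e.simple  <- seq(1, 3, length.out = 50)      # consistently large errors
e.complex <- seq(0.1, 0.5, length.out = 50)  # consistently small errors
p.value <- rand_t_sketch(e.simple, e.complex)
```

A small p-value suggests that the extra component gives a real reduction in prediction error rather than one that could arise by chance.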
Function screeplot displays the RMSE of prediction for the training set as a function of the number of factors and is useful for estimating the optimal number for use in prediction. By default screeplot will also carry out a randomisation t-test and add a line to the scree plot indicating the percentage change in RMSE with each component, annotated with the p-value from the randomisation test.
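The percentage change in RMSE between successive numbers of factors can be worked through by hand. The RMSE values below are invented for illustration, and the convention used here (change relative to the previous model, negative meaning improvement, no value for the first factor) is an assumption of the sketch rather than a description of the package output.

```r
# Hypothetical cross-validated RMSE values for models with 1 to 5 factors
rmsep <- c(2.10, 1.60, 1.52, 1.50, 1.51)

# Percentage change relative to the model with one fewer factor
pct.change <- c(NA, diff(rmsep) / head(rmsep, -1) * 100)

scree <- data.frame(nFact = seq_along(rmsep),
                    RMSEP = rmsep,
                    pct.change = round(pct.change, 1))

# Number of factors with the lowest RMSE in this invented example
best <- which.min(rmsep)
```

In this example the second factor reduces RMSE substantially, while later factors change it by less than a few percent, which is the kind of pattern a scree plot is meant to reveal.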
Function IKFA returns an object of class IKFA with the following named elements:
coefficients |
species coefficients (the updated "optima"). |
fitted.values |
fitted values for the training set. |
call |
original function call. |
x |
environmental variable used in the model. |
standx , meanT , sdx |
additional information returned for a PLSif model. |
Function crossval also returns an object of class IKFA and adds the following named elements:
predicted |
predicted values of each training set sample under cross-validation. |
residuals.cv |
prediction residuals. |
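How cross-validation predictions and residuals of this kind arise can be sketched with leave-group-out cross-validation in base R. A simple linear model stands in for the transfer function, the data are simulated, and all object names are hypothetical; this is not the crossval implementation.

```r
# Minimal sketch of leave-group-out cross-validation with a stand-in model.
set.seed(7)
n <- 25
ngroups <- 10
dat <- data.frame(z = runif(n))
dat$x <- 1 + 2 * dat$z + rnorm(n, sd = 0.2)

# Randomly assign each sample to one of ngroups groups of near-equal size
grp <- sample(rep(seq_len(ngroups), length.out = n))

pred.cv <- numeric(n)
for (g in seq_len(ngroups)) {
  test <- grp == g
  mod <- lm(x ~ z, data = dat[!test, ])            # fit without group g
  pred.cv[test] <- predict(mod, dat[test, , drop = FALSE])  # predict group g
}

resid.cv <- dat$x - pred.cv          # cross-validation prediction residuals
rmsep <- sqrt(mean(resid.cv^2))      # root mean squared error of prediction
```

Each sample is predicted by a model that never saw it, so the resulting RMSE is a more honest estimate of prediction error than the RMSE of the fitted values.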
If function predict is called with newdata=NULL it returns the fitted values of the original model; otherwise it returns a list with the following named elements:
fit |
predicted values for |
If sample specific errors were requested the list will also include:
fit.boot |
mean of the bootstrap estimates of newdata. |
v1 |
standard error of the bootstrap estimates for each new sample. |
v2 |
root mean squared error for the training set samples, across all bootstrap samples. |
SEP |
standard error of prediction, calculated as the square root of v1^2 + v2^2. |
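The relationship between these quantities can be illustrated with a small bootstrap in base R. A linear model stands in for the transfer function, the data are simulated, and estimating v2 from the out-of-bag training samples is an assumption of this sketch; it is not the package's code.

```r
# Minimal sketch of bootstrap sample-specific errors with a stand-in model.
set.seed(42)
n <- 30
train <- data.frame(z = runif(n))
train$x <- 5 + 3 * train$z + rnorm(n, sd = 0.5)
newd <- data.frame(z = c(0.2, 0.8))     # two hypothetical new samples

nboot <- 200
boot.new <- matrix(NA_real_, nrow(newd), nboot)
oob.sqerr <- vector("list", nboot)

for (b in seq_len(nboot)) {
  idx <- sample(n, replace = TRUE)      # bootstrap training set
  oob <- setdiff(seq_len(n), idx)       # samples left out of this bootstrap
  mod <- lm(x ~ z, data = train[idx, ])
  boot.new[, b] <- predict(mod, newd)
  oob.sqerr[[b]] <- (train$x[oob] - predict(mod, train[oob, , drop = FALSE]))^2
}

fit.boot <- rowMeans(boot.new)          # mean bootstrap estimate per new sample
v1 <- apply(boot.new, 1, sd)            # spread of bootstrap estimates per new sample
v2 <- sqrt(mean(unlist(oob.sqerr)))     # RMSE over left-out training samples
SEP <- sqrt(v1^2 + v2^2)                # combined standard error of prediction
```

Here v1 varies from sample to sample while v2 is a single value shared by all new samples, so SEP can never be smaller than v2.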
Function performance returns a matrix of performance statistics for the IKFA model. See performance for a description of the summary.
Function rand.t.test returns a matrix of performance statistics together with columns indicating the p-value and percentage change in RMSE with each higher component (see van der Voet (1994) for details).
Steve Juggins
Imbrie, J. & Kipp, N.G. (1971). A new micropaleontological method for quantitative paleoclimatology: application to a Late Pleistocene Caribbean core. In The Late Cenozoic Glacial Ages (ed K.K. Turekian), pp. 77-181. Yale University Press, New Haven.
van der Voet, H. (1994). Comparing the predictive accuracy of models using a simple randomization test. Chemometrics and Intelligent Laboratory Systems, 25, 313-323.
WA, MAT, performance, and compare.datasets for diagnostics.
data(IK)
spec <- IK$spec
SumSST <- IK$env$SumSST
core <- IK$core
fit <- IKFA(spec, SumSST)
fit
# cross-validate model
fit.cv <- crossval(fit, cv.method="lgo")
# How many components to use?
screeplot(fit.cv)
# predict the core
# (predict for class 'IKFA' has no npls argument, so it is not passed here)
pred <- predict(fit, core)
# plot predictions for the 2-factor model - depths are in rownames
depth <- as.numeric(rownames(core))
plot(depth, pred$fit[, 2], type="b")
# fit using only factors 1, 2, 4, & 5
# and using polynomial terms
# as Imbrie & Kipp (1971)
fit2 <- IKFA(spec, SumSST, ccoef=c(1, 2, 4, 5), IsPoly=TRUE)
fit2.cv <- crossval(fit2, cv.method="lgo")
screeplot(fit2.cv)
## Not run:
# predictions with sample specific errors
# takes approximately 1 minute to run
pred <- predict(fit, core, sse=TRUE, nboot=1000)
pred
## End(Not run)