MAT: Palaeoenvironmental reconstruction using the Modern Analogue...

Description

Functions for reconstructing (predicting) environmental values from biological assemblages using the Modern Analogue Technique (MAT), also known as k nearest neighbours (k-NN).

Usage

MAT(y, x, dist.method="sq.chord", k=5, lean=TRUE)

## S3 method for class 'MAT'
predict(object, newdata=NULL, k=object$k, sse=FALSE, 
        nboot=100, match.data=TRUE, verbose=TRUE, lean=TRUE, 
        ...)

## S3 method for class 'MAT'
performance(object, ...)

## S3 method for class 'MAT'
crossval(object, k=object$k, cv.method="lgo", 
        verbose=TRUE, ngroups=10, nboot=100, h.cutoff=0, h.dist=NULL, ...)

## S3 method for class 'MAT'
print(x, ...)

## S3 method for class 'MAT'
summary(object, full=FALSE, ...)

## S3 method for class 'MAT'
plot(x, resid=FALSE, xval=FALSE, k=5, wMean=FALSE, xlab="", 
      ylab="", ylim=NULL, xlim=NULL, add.ref=TRUE,
      add.smooth=FALSE, ...)

## S3 method for class 'MAT'
residuals(object, cv=FALSE, ...)

## S3 method for class 'MAT'
fitted(object, ...)

## S3 method for class 'MAT'
screeplot(x, ...)

paldist(y, dist.method="sq.chord")

paldist2(y1, y2, dist.method="sq.chord")

Arguments

y, y1, y2

data frame containing biological data.

newdata

data frame containing biological data to predict from.

x

a vector of environmental values to be modelled, matched to y.

dist.method

dissimilarity coefficient. See details for options.

match.data

logical indicating whether the function should match the two species datasets by their column names. Set this to FALSE only if you are sure the column names match exactly.

k

number of analogues to use.

lean

logical to remove items from the output.

object

an object of class MAT.

resid

logical to plot residuals instead of fitted values.

xval

logical to plot cross-validation estimates.

wMean

logical to plot weighted-mean estimates.

xlab, ylab, xlim, ylim

additional graphical arguments passed to plot.

add.ref

add 1:1 line on plot.

add.smooth

add loess smooth to plot.

cv.method

cross-validation method, either "lgo", "bootstrap" or "h-block".

verbose

logical to show feedback during cross-validation.

nboot

number of bootstrap samples.

ngroups

number of groups in leave-group-out cross-validation, or a vector containing leave-out group membership.

h.cutoff

cutoff for h-block cross-validation. Only training samples at a distance greater than h.cutoff from each test sample will be used.

h.dist

distance matrix for use in h-block cross-validation. Usually a matrix of geographical distances between samples.

sse

logical indicating that sample specific errors should be calculated.

full

logical to indicate a full or abbreviated summary.

cv

logical to indicate model or cross-validation residuals.

...

additional arguments.
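The way h.cutoff and h.dist interact can be illustrated with a small base R sketch. It is not the package's implementation: the site coordinates, the cutoff, and the names coords, h_dist, h_cutoff and train_for are all hypothetical, chosen only to show which training samples would remain for each test sample under the rule described for h.cutoff.

```r
# Hypothetical site positions along a transect (km); not from any real dataset.
coords <- c(a = 0, b = 2, c = 5, d = 30, e = 33)

# Pairwise geographical distances, playing the role of h.dist.
h_dist <- as.matrix(dist(coords))

# Hypothetical cutoff, playing the role of h.cutoff.
h_cutoff <- 10

# For each test sample, keep only training samples farther away than the cutoff.
train_for <- lapply(seq_along(coords), function(i) {
  names(coords)[h_dist[i, ] > h_cutoff]
})
names(train_for) <- names(coords)

train_for$a  # sites usable when predicting site "a"
```

In this sketch a cutoff of 0 removes only the test sample itself (its distance to itself is 0), so the scheme would reduce to leave-one-out unless other samples share the same location.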

Details

MAT performs an environmental reconstruction using the modern analogue technique. Function MAT takes a training dataset of biological data (species abundances) y and a single associated environmental variable x, and generates a model of closest analogues, or matches, for the modern data using one of a number of dissimilarity coefficients.

Options for the dissimilarity coefficient are: "euclidean", "sq.euclidean", "chord", "sq.chord", "chord.t", "sq.chord.t", "chi.squared", "sq.chi.squared", "bray". "chord.t" gives true chord distances, whereas "chord" refers to the variant of chord distance used in palaeoecology (e.g. Overpeck et al. 1985), which is actually Hellinger's distance (Legendre & Gallagher 2001).

There are various helper functions to plot and extract information from the results of a MAT transfer function. The function predict takes a MAT object and uses it to predict environmental values for a new set of species data, or returns the fitted (predicted) values from the original modern dataset if newdata is NULL. Variables are matched between training and newdata by column name (if match.data is TRUE). Use compare.datasets to assess conformity of two species datasets and identify possible no-analogue samples.
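To make the technique concrete, the following base R sketch works through one reconstruction by hand. It is an illustration under stated assumptions, not the code used by MAT: the toy data and the names sq_chord, train, ph and fossil are hypothetical, the abundances are assumed to be proportions, and the weighted mean is assumed to use inverse-dissimilarity weights.

```r
# Squared chord dissimilarity between two assemblages (proportions assumed).
sq_chord <- function(p, q) sum((sqrt(p) - sqrt(q))^2)

# Hypothetical training set: 4 samples x 3 taxa, with a known pH for each sample.
train <- rbind(s1 = c(0.7, 0.2, 0.1),
               s2 = c(0.6, 0.3, 0.1),
               s3 = c(0.1, 0.3, 0.6),
               s4 = c(0.2, 0.2, 0.6))
ph <- c(s1 = 5.0, s2 = 5.2, s3 = 6.8, s4 = 6.5)

# Hypothetical fossil sample to reconstruct.
fossil <- c(0.65, 0.25, 0.10)

# Dissimilarity between the fossil sample and every training sample.
d <- apply(train, 1, sq_chord, q = fossil)

# Select the k closest analogues and average their environmental values.
k <- 2
nearest <- order(d)[seq_len(k)]
est_mean <- mean(ph[nearest])

# Weighted mean; inverse-dissimilarity weighting is an assumption here,
# and it would need special handling for an exact match (d == 0).
est_wmean <- sum(ph[nearest] / d[nearest]) / sum(1 / d[nearest])

names(d)[nearest]  # the chosen analogues
```

The dissimilarity of the closest analogue, min(d), is the kind of diagnostic that helps flag samples with no good modern analogue.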

MAT has methods fitted and residuals that return the fitted values (estimates) and residuals for the training set, performance, which returns summary performance statistics (see below), and print and summary to summarise the output. MAT also has a plot method that produces scatter plots of predicted vs observed measurements for the training set.

Function screeplot displays the RMSE of prediction for the training set as a function of the number of analogues (k) and is useful for estimating the optimal value of k for use in prediction.
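The idea behind this plot can be sketched in base R with simulated data. The sketch is an assumption-laden illustration rather than the screeplot method itself: the simulated gradient, the Gaussian taxon responses, and the names grad, abund, rmse and best_k are hypothetical, and each training sample is assumed to be excluded from its own set of analogues.

```r
set.seed(1)
n <- 40
grad <- sort(runif(n, 4.5, 7.5))  # hypothetical environmental variable

# Three hypothetical taxa with Gaussian responses along the gradient.
optima <- c(5, 6, 7)
abund <- sapply(optima, function(u) exp(-(grad - u)^2 / 0.5))
abund <- abund / rowSums(abund)   # convert to proportions

# Squared chord dissimilarities: squared Euclidean distance on square roots.
d <- as.matrix(dist(sqrt(abund)))^2
diag(d) <- Inf                    # a sample cannot be its own analogue

# RMSE of the mean of the k closest analogues, for k = 1..max_k.
max_k <- 10
rmse <- sapply(seq_len(max_k), function(k) {
  est <- sapply(seq_len(n), function(i) mean(grad[order(d[i, ])[seq_len(k)]]))
  sqrt(mean((est - grad)^2))
})

best_k <- which.min(rmse)         # candidate number of analogues
# plot(seq_len(max_k), rmse, type = "b", xlab = "k", ylab = "RMSE")
```

The value of k at the minimum is only a guide; a flat curve suggests that several values of k perform similarly.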

paldist and paldist2 are helper functions, though they may be called directly. paldist takes a single data frame or matrix and returns a distance matrix of the row-wise dissimilarities. paldist2 takes two data frames or matrices and returns a matrix of all row-wise dissimilarities between the two datasets.
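A rough base R equivalent of the two-dataset case, for the squared chord coefficient only, might look like the sketch below. The function sq_chord_cross and the toy matrices are hypothetical, and the layout of the result (rows for the first dataset, columns for the second) is an assumption rather than a documented property of paldist2.

```r
# Squared chord dissimilarity between every row of y1 and every row of y2.
sq_chord_cross <- function(y1, y2) {
  y1 <- as.matrix(y1)
  y2 <- as.matrix(y2)
  stopifnot(ncol(y1) == ncol(y2))  # taxa must already be matched by column
  out <- matrix(0, nrow(y1), nrow(y2),
                dimnames = list(rownames(y1), rownames(y2)))
  for (i in seq_len(nrow(y1))) {
    # Subtract the square-rooted i-th row of y1 from every row of sqrt(y2).
    out[i, ] <- rowSums(sweep(sqrt(y2), 2, sqrt(y1[i, ]))^2)
  }
  out
}

# Hypothetical modern and fossil assemblages (proportions).
modern <- rbind(m1 = c(0.5, 0.5), m2 = c(1, 0), m3 = c(0, 1))
fossil <- rbind(f1 = c(0.5, 0.5), f2 = c(1, 0))

cross <- sq_chord_cross(fossil, modern)  # 2 x 3 matrix
self <- sq_chord_cross(modern, modern)   # symmetric, zero diagonal
```

Passing the same dataset twice gives the single-dataset case as a full square matrix.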

Value

Function MAT returns an object of class MAT which contains the following items:

call

original function call to MAT.

fitted.values

fitted (predicted) values for the training set, as the mean and weighted mean (weighted by dissimilarity) of the k closest analogues.

diagnostics

standard deviation of the k analogues and dissimilarity of the closest analogue.

dist.n

dissimilarities of the k closest analogues.

x.n

environmental values of the k closest analogues.

match.name

column names of the k closest analogues.

x

environmental variable used in the model.

dist.method

dissimilarity coefficient.

k

number of closest analogues to use.

y

original species data.

cv.summary

summary of the cross-validation (not yet implemented).

dist

dissimilarity matrix (returned if lean=FALSE).

If function predict is called with newdata=NULL it returns a matrix of fitted values from the original training set analysis. If newdata is not NULL it returns a list with the following named elements:

fit

predictions for newdata.

diagnostics

standard deviations of the k closest analogues and distance of closest analogue.

dist.n

dissimilarities of the k closest analogues.

x.n

environmental values of the k closest analogues.

match.name

column names of the k closest analogues.

dist

dissimilarity matrix (returned if lean=FALSE).

If sample specific errors were requested the list will also include:

fit.boot

mean of the bootstrap estimates of newdata.

v1

standard error of the bootstrap estimates for each new sample.

v2

root mean squared error for the training set samples, across all bootstrap samples.

SEP

standard error of prediction, calculated as the square root of v1^2 + v2^2.
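As a worked illustration of how the two error components combine, the sketch below computes SEP from made-up values. The numbers are hypothetical and are not output from any fitted model.

```r
# Hypothetical per-sample bootstrap standard errors for three new samples.
v1 <- c(0.12, 0.20, 0.15)

# Hypothetical training-set RMSE across bootstrap samples.
v2 <- 0.31

# Standard error of prediction for each new sample.
sep <- sqrt(v1^2 + v2^2)
sep
```

Because v2 enters every term, SEP can never fall below the training-set component, however tightly the bootstrap estimates for a sample agree.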

Functions paldist and paldist2 return dissimilarity matrices. performance returns a matrix of performance statistics for the MAT model, with columns for RMSE, R2, mean and max bias for each number of analogues up to k. See performance for a description of the output.

Author(s)

Steve Juggins

References

Legendre, P. & Gallagher, E. (2001) Ecologically meaningful transformations for ordination of species. Oecologia, 129, 271-280.

Overpeck, J.T., Webb, T., III, & Prentice, I.C. (1985) Quantitative interpretation of fossil pollen spectra: dissimilarity coefficients and the method of modern analogs. Quaternary Research, 23, 87-108.

See Also

WAPLS, WA, performance, and compare.datasets for diagnostics.

Examples

# pH reconstruction of the RLGH, Scotland, using SWAP training set 
# shows recent acidification history
data(SWAP)
data(RLGH)
fit <- MAT(SWAP$spec, SWAP$pH, k=20)  # generate results for k 1-20
#examine performance
performance(fit)
print(fit)
# How many analogues?
screeplot(fit)
# do the reconstruction
pred.mat <- predict(fit, RLGH$spec, k=10)
# plot the reconstruction
plot(RLGH$depths$Age, pred.mat$fit[, 1], type="b", ylab="pH", xlab="Age")

#compare to a weighted average model
fit <- WA(SWAP$spec, SWAP$pH)
pred.wa <- predict(fit, RLGH$spec)
points(RLGH$depths$Age, pred.wa$fit[, 1], col="red", type="b")
legend("topleft", c("MAT", "WA"), lty=1, col=c("black", "red"))

Example output

This is rioja 0.9-21
$RMSE0
[1] 0.7694898

$object
            RMSE        R2   Avg.Bias  Max.Bias    Skill
N01    0.4227136 0.7139018 0.02543713 0.3973333 69.82226
N02    0.3740649 0.7701885 0.04932934 0.4689167 76.36867
N03    0.3386965 0.8088455 0.03792415 0.4033889 80.62616
N04    0.3281850 0.8200202 0.03346108 0.4438333 81.81004
N05    0.3136441 0.8355849 0.02872814 0.4124333 83.38621
N06    0.3072492 0.8443662 0.03855389 0.4151667 84.05679
N07    0.3167268 0.8363733 0.04813088 0.4178810 83.05803
N08    0.3065426 0.8473774 0.04333608 0.4130000 84.13003
N09    0.3048527 0.8495367 0.04361277 0.4110741 84.30453
N10    0.3014974 0.8547771 0.04728683 0.4083333 84.64812
N11    0.3042808 0.8517789 0.04706913 0.4357273 84.36337
N12    0.3127547 0.8443953 0.05072056 0.4335000 83.48030
N13    0.3114684 0.8485511 0.05493183 0.4442564 83.61591
N14    0.3141377 0.8474564 0.05704149 0.4725119 83.33388
N15    0.3151628 0.8476685 0.05496527 0.4927444 83.22493
N16    0.3182608 0.8462633 0.05815644 0.5126458 82.89353
N17    0.3191509 0.8478043 0.06160197 0.5221078 82.79771
N18    0.3192246 0.8496456 0.06396108 0.5313333 82.78975
N19    0.3235022 0.8478382 0.06735109 0.5517982 82.32544
N20    0.3261004 0.8464416 0.06705838 0.5619667 82.04039
N01.wm 0.4227136 0.7139018 0.02543713 0.3973333 69.82226
N02.wm 0.3711241 0.7733823 0.04760236 0.4613843 76.73877
N03.wm 0.3375492 0.8102076 0.03845227 0.4088414 80.75719
N04.wm 0.3271615 0.8212919 0.03464252 0.4432979 81.92332
N05.wm 0.3144147 0.8348247 0.02975297 0.4204806 83.30448
N06.wm 0.3076881 0.8435056 0.03711268 0.4252744 84.01120
N07.wm 0.3147571 0.8377312 0.04509658 0.4249791 83.26809
N08.wm 0.3049275 0.8482811 0.04065341 0.4206146 84.29682
N09.wm 0.3035249 0.8499857 0.04077227 0.4204672 84.44095
N10.wm 0.3004893 0.8545816 0.04423961 0.4180487 84.75061
N11.wm 0.3024601 0.8525684 0.04422518 0.4367043 84.54993
N12.wm 0.3095924 0.8463617 0.04729194 0.4342075 83.81268
N13.wm 0.3081859 0.8500700 0.05092721 0.4436425 83.95943
N14.wm 0.3105593 0.8489667 0.05287554 0.4647638 83.71141
N15.wm 0.3110852 0.8495811 0.05136803 0.4803950 83.65620
N16.wm 0.3134112 0.8487691 0.05406664 0.4980537 83.41088
N17.wm 0.3138147 0.8503993 0.05700899 0.5061368 83.36814
N18.wm 0.3136284 0.8522520 0.05906668 0.5142370 83.38788
N19.wm 0.3169733 0.8509667 0.06191976 0.5308091 83.03165
N20.wm 0.3189742 0.8499905 0.06153679 0.5398011 82.81675


Method : Modern Analogue Technique
Call   : MAT(y = y, x = x, dist.method = "sq.chord", k = 20, lean = TRUE) 

Distance : sq.chord 
No. samples        : 167 
No. species        : 277 
Cross val.         : none 


Performance:
          RMSE      R2  Avg.Bias  Max.Bias    Skill
N01     0.4227  0.7139    0.0254    0.3973  69.8223
N02     0.3741  0.7702    0.0493    0.4689  76.3687
N03     0.3387  0.8088    0.0379    0.4034  80.6262
N04     0.3282  0.8200    0.0335    0.4438  81.8100
N05     0.3136  0.8356    0.0287    0.4124  83.3862
N06     0.3072  0.8444    0.0386    0.4152  84.0568
N07     0.3167  0.8364    0.0481    0.4179  83.0580
N08     0.3065  0.8474    0.0433    0.4130  84.1300
N09     0.3049  0.8495    0.0436    0.4111  84.3045
N10     0.3015  0.8548    0.0473    0.4083  84.6481
N11     0.3043  0.8518    0.0471    0.4357  84.3634
N12     0.3128  0.8444    0.0507    0.4335  83.4803
N13     0.3115  0.8486    0.0549    0.4443  83.6159
N14     0.3141  0.8475    0.0570    0.4725  83.3339
N15     0.3152  0.8477    0.0550    0.4927  83.2249
N16     0.3183  0.8463    0.0582    0.5126  82.8935
N17     0.3192  0.8478    0.0616    0.5221  82.7977
N18     0.3192  0.8496    0.0640    0.5313  82.7898
N19     0.3235  0.8478    0.0674    0.5518  82.3254
N20     0.3261  0.8464    0.0671    0.5620  82.0404
N01.wm  0.4227  0.7139    0.0254    0.3973  69.8223
N02.wm  0.3711  0.7734    0.0476    0.4614  76.7388
N03.wm  0.3375  0.8102    0.0385    0.4088  80.7572
N04.wm  0.3272  0.8213    0.0346    0.4433  81.9233
N05.wm  0.3144  0.8348    0.0298    0.4205  83.3045
N06.wm  0.3077  0.8435    0.0371    0.4253  84.0112
N07.wm  0.3148  0.8377    0.0451    0.4250  83.2681
N08.wm  0.3049  0.8483    0.0407    0.4206  84.2968
N09.wm  0.3035  0.8500    0.0408    0.4205  84.4410
N10.wm  0.3005  0.8546    0.0442    0.4180  84.7506
N11.wm  0.3025  0.8526    0.0442    0.4367  84.5499
N12.wm  0.3096  0.8464    0.0473    0.4342  83.8127
N13.wm  0.3082  0.8501    0.0509    0.4436  83.9594
N14.wm  0.3106  0.8490    0.0529    0.4648  83.7114
N15.wm  0.3111  0.8496    0.0514    0.4804  83.6562
N16.wm  0.3134  0.8488    0.0541    0.4981  83.4109
N17.wm  0.3138  0.8504    0.0570    0.5061  83.3681
N18.wm  0.3136  0.8523    0.0591    0.5142  83.3879
N19.wm  0.3170  0.8510    0.0619    0.5308  83.0317
N20.wm  0.3190  0.8500    0.0615    0.5398  82.8168

rioja documentation built on Oct. 28, 2020, 5:07 p.m.
