bagging | R Documentation |
Creates bagged tree estimates from diet data
bagging(formula, data, weights, subset, na.action = na.dpart,
model = FALSE, x = FALSE, y = TRUE, parms, control,
cost, nBaggs,
spatial = list(fit = FALSE, sizeofgrid = 5,
nsub = NULL, ID = NULL, LonID = "Longitude",
LatID = "Latitude"),
Plot = FALSE,
predID, numCores = 1, ...)
formula |
a formula, with a response but no interaction terms
as for the |
data |
an optional data frame in which to interpret the variables named in the formula |
weights |
case weights |
subset |
optional expression saying that only a subset of the rows of the data should be used in the fit. |
na.action |
The default action deletes all observations for which |
model |
if logical: keep a copy of the model frame in the result? If the input value for model is a model frame (likely from an earlier call to the rpart function), then this frame is used rather than constructing new data. |
x |
keep a copy of the x matrix in the result. |
y |
keep a copy of the dependent variable in the result. If missing and |
parms |
optional parameters for the splitting function. For classification splitting,
the list can contain any of: the vector of prior probabilities (component prior), the loss
matrix (component loss) or the splitting index (component split). The priors must be
positive and sum to 1. The loss matrix must have zeros on the diagonal and positive
off-diagonal elements. The splitting index can be gini or information. The default
priors are proportional to the data counts, the losses default to 1, and the split
defaults to |
control |
options that control details of the |
cost |
a vector of non-negative costs, one for each variable in the model. Defaults to one for all variables. These are scalings to be applied when considering splits, so the improvement on splitting on a variable is divided by its cost in deciding which split to choose. |
nBaggs |
numeric. Number of bootstrap samples. |
spatial |
A list with the following elements:
fit = whether to do spatial bootstrapping
sizeofgrid = size of the spatial tile to sample from (default is 5)
nsub = number of sub-samples to take (defaults to no subsampling)
ID = ID from which to subsample (e.g. TripSetPredNo); only required if sub-sampling is required |
Plot |
logical. Whether to plot the spatial grid with samples (default: FALSE, no plotting) |
predID |
predator ID |
numCores |
Number of cores over which to distribute the bagging. Only available under Unix (default: 1) |
... |
arguments to be passed to or from other methods. |
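The rules stated above for parms (positive priors summing to 1, a loss matrix with a zero diagonal and positive off-diagonal entries, a split of gini or information) and the cost scaling can be illustrated in base R. This is a minimal sketch, not code from the package: the helper check_parms and all numeric values are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not part of the package): checks a classification
# parms list against the rules given in the argument description.
check_parms <- function(parms) {
  if (!is.null(parms$prior)) {
    stopifnot(all(parms$prior > 0),
              isTRUE(all.equal(sum(parms$prior), 1)))
  }
  if (!is.null(parms$loss)) {
    off_diag <- parms$loss[row(parms$loss) != col(parms$loss)]
    stopifnot(all(diag(parms$loss) == 0), all(off_diag > 0))
  }
  if (!is.null(parms$split)) {
    stopifnot(parms$split %in% c("gini", "information"))
  }
  invisible(TRUE)
}

# Made-up parms list for a three-class response.
parms <- list(prior = c(0.5, 0.3, 0.2),
              loss  = matrix(c(0, 1, 1,
                               1, 0, 1,
                               1, 1, 0), nrow = 3, byrow = TRUE),
              split = "information")
parms_ok <- check_parms(parms)

# Cost scaling: made-up improvements for three candidate split variables.
# Each improvement is divided by the variable's cost before comparison.
improvement <- c(Lat = 12, SST = 10, Length = 9)
cost        <- c(Lat = 2,  SST = 1,  Length = 1)
scaled      <- improvement / cost
best        <- names(which.max(scaled))
```

In this made-up case Lat has the largest raw improvement, but its cost of 2 halves it, so the variable with the largest scaled improvement is chosen instead.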
Users will need to determine whether spatial bootstrapping is required. They can use the resid function to examine the residuals from the fit of the model to determine whether this is required.
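For readers unfamiliar with spatial bootstrapping, the idea of resampling spatial tiles rather than individual observations can be sketched in base R. This is a generic spatial block bootstrap written under assumptions: the data are made up, the tiles are assumed to be squares of side sizeofgrid in the same units as the coordinates, and the package's own tiling and sampling may well differ.

```r
set.seed(1)

# Made-up locations; the column names follow the LonID/LatID defaults.
dat <- data.frame(Longitude = runif(200, -150, -100),
                  Latitude  = runif(200, -20, 20))
sizeofgrid <- 5  # assumed here to be the tile side length

# Label each observation with the tile that contains it.
tile <- paste(floor(dat$Longitude / sizeofgrid),
              floor(dat$Latitude / sizeofgrid), sep = "_")

# Resample tiles with replacement, then keep every row in each drawn tile.
tiles    <- unique(tile)
drawn    <- sample(tiles, length(tiles), replace = TRUE)
rows     <- unlist(lapply(drawn, function(tl) which(tile == tl)))
boot_dat <- dat[rows, ]
```

Because whole tiles are drawn together, observations that are close in space stay together in each bootstrap sample, which is the motivation for using this scheme when residuals show spatial structure.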
A list with the following elements:
baggs |
tree objects for each |
oob |
numeric vector indicating the samples left as out of bag (oob) samples. |
pred.oob |
predicted prey composition for each set of out of bag samples. |
pred |
all predicted prey compositions for each bootstrap sample. |
resid |
data frame of residuals from the fitted tree for each bootstrap sample. |
data |
bootstrap sample dataset |
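The relationship between a bootstrap sample and its out of bag (oob) samples can be sketched in base R. This is an illustration of the general idea only: the sizes are made up, and how the package stores its oob and data elements is not shown here.

```r
set.seed(42)
n      <- 20  # made-up number of observations
nBaggs <- 3   # small value for illustration

# One bootstrap sample of row indices per bag, drawn with replacement.
inbag <- lapply(seq_len(nBaggs), function(b) sample(n, n, replace = TRUE))

# Out of bag rows for each bag: rows never drawn into that bag.
oob <- lapply(inbag, function(idx) setdiff(seq_len(n), idx))
```

Each bag's in-bag and out of bag rows are disjoint and together cover all n rows, so the out of bag rows act as a held-out set for the tree fitted to that bag.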
Kuhnert PM, Duffy LM, Olson RJ (2012) The analysis of predator diet and stable isotope data. J Stat Softw, in prep.
Kuhnert PM, Kinsey-Henderson A, Bartley R, Herr A (2010) Incorporating uncertainty in gully erosion calculations using the random forests modelling approach. Environmetrics 21:493-509. doi:10.1002/env.999
Breiman L (1996) Bagging predictors. Mach Learn 24:123-140. doi:10.1023/A:1018054314350
Breiman L (1998) Arcing classifiers (with discussion). Ann Stat 26:801-824. doi:10.2307/120055
Breiman L (2001) Random forests. Mach Learn 45:5-32. doi:10.1023/A:1010933404324
# Assigning prey colours for default palette
#val <- apc(x = yftdiet, preyfile = PreyTaxonSort, check = TRUE)
#node.colsY <- val$cols
#dietPP <- val$x # updated diet matrix with Group assigned prey taxa codes
# Bagging
# Bagging with NO spatial bootstrapping
# N.B. Not run as this takes a while
#yft.bag <- bagging(Group ~ Lat + Lon + Year + Quarter + SST + Length,
# data = dietPP, weights = W, minsplit = 50,
# cp = 0.001, nBaggs = 500, predID = "TripSetPredNo")
#