buildMod | R Documentation |
This is the main function to apply a gbm::gbm()
model to a data set.
buildMod(
input_data,
vars = c("trend", "ws", "wd", "hour", "weekday", "air_temp"),
pollutant = "nox",
sam.size = nrow(input_data),
n.trees = 200,
shrinkage = 0.1,
interaction.depth = 5,
bag.fraction = 0.5,
n.minobsinnode = 10,
cv.folds = 0,
simulate = FALSE,
B = 100,
n.core = 4,
seed = 123,
type = "PSOCK"
)
input_data |
Data frame to analyse. Must contain a POSIXct field called
|
vars |
Explanatory variables to use. These variables will be used to
build the |
pollutant |
The name of the variable to apply meteorological normalisation to. |
sam.size |
The number of random samples to extract from the data for
model building. While it is possible to use the full data set, for data
sets spanning years the model building can take a very long time to run.
Additionally, there will be diminishing returns in terms of model accuracy.
If |
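As a rough illustration of what drawing sam.size random rows looks like, here is a minimal base R sketch. It is not buildMod()'s internal code: the data frame, its columns and the choice to sample without replacement are all assumptions made for the example.

```r
# Hypothetical data frame standing in for input_data.
set.seed(123)
input_data <- data.frame(ws = runif(1000, 0.5, 10), nox = rnorm(1000, 50, 10))

# Requested sample size, capped at the number of rows available.
sam.size <- 200
sam.size <- min(sam.size, nrow(input_data))

# Draw row indices at random (without replacement, by assumption)
# and keep only those rows for model building.
rows <- sample(nrow(input_data), size = sam.size)
sampled <- input_data[rows, ]

nrow(sampled)
```

A smaller sam.size shortens model building at the cost of using less of the data.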
n.trees |
Number of trees to fit. |
shrinkage |
A shrinkage parameter applied to each tree in the expansion.
Also known as the learning rate or step-size reduction; |
interaction.depth |
Integer specifying the maximum depth of each tree
(i.e., the highest level of variable interactions allowed). A value of |
bag.fraction |
The fraction of the training set observations randomly
selected to propose the next tree in the expansion. This introduces
randomness into the model fit. If |
n.minobsinnode |
Integer specifying the minimum number of observations in the terminal nodes of the trees. Note that this is the actual number of observations, not the total weight. |
cv.folds |
Number of cross-validation folds to perform. If |
simulate |
Should the original time series be randomly sampled with
replacement? The default is |
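Sampling a series with replacement can be sketched in base R as below. This shows the general bootstrap idea only, under the assumption that resampling is done row by row. The data frame is hypothetical and the code is not taken from buildMod().

```r
# Hypothetical time series with one row per observation.
set.seed(123)
n <- 500
series <- data.frame(id = seq_len(n), value = rnorm(n))

# Resample row indices with replacement: the result has the same
# length as the original, but some rows repeat and others drop out.
boot_rows <- sample(n, size = n, replace = TRUE)
boot <- series[boot_rows, ]

# Number of distinct original rows that made it into the resample.
length(unique(boot_rows))
```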
B |
Number of bootstrap simulations for partial dependence plots. |
n.core |
Number of cores to use for parallel processing. |
seed |
Random number seed for reproducibility in returned model. |
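The effect of fixing a seed can be seen with a small base R sketch. The helper draw_rows() is hypothetical and only mimics a random sampling step. It does not show how buildMod() uses seed internally.

```r
# Hypothetical helper: set the seed, then draw `size` row indices from 1..n.
draw_rows <- function(seed, n = 100, size = 10) {
  set.seed(seed)
  sample(n, size = size)
}

first <- draw_rows(123)
second <- draw_rows(123)  # same seed, same draw
other <- draw_rows(456)   # different seed, different draw

identical(first, second)
```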
type |
One of the supported parallelisation types. See
|
Returns a list including the model, influence data frame and partial dependence data frame.
David Carslaw
testMod() for testing models before they are built.
metSim() for using a built model with meteorological simulations.
plot2Way(), plotInfluence() and plotPD() for visualising built models.
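A hedged sketch of a call using the arguments documented above follows. The input data are synthetic and purely illustrative. The POSIXct column is assumed to be named date, which the truncated input_data description does not confirm. The remaining explanatory variables (trend, hour, weekday) are assumed to be derived from that column by the package. The call only runs if a deweather installation that exports buildMod() is available.

```r
# Synthetic hourly data; every value here is made up for illustration.
set.seed(123)
n <- 24 * 60
input_data <- data.frame(
  # Column name `date` is an assumption (see the note above).
  date = seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = n),
  ws = runif(n, 0.5, 10),
  wd = runif(n, 0, 360),
  air_temp = rnorm(n, mean = 10, sd = 5)
)
# Fake pollutant that decreases with wind speed, plus noise.
input_data$nox <- 50 + 100 / input_data$ws + rnorm(n, sd = 5)

# Only attempt the model build if the package and function are present.
have_buildMod <- requireNamespace("deweather", quietly = TRUE) &&
  "buildMod" %in% getNamespaceExports("deweather")

if (have_buildMod) {
  mod <- deweather::buildMod(
    input_data,
    vars = c("trend", "ws", "wd", "hour", "weekday", "air_temp"),
    pollutant = "nox",
    n.trees = 200,
    B = 10,      # fewer bootstrap simulations than the default, for speed
    n.core = 1,  # single core keeps the sketch portable
    seed = 123
  )
  # Inspect what the returned list contains.
  print(names(mod))
}
```

Lowering B and n.core here is a choice made to keep the example quick; the defaults shown in the usage above are 100 and 4.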