CreateSpacetimeFolds | R Documentation |
Create spatial, temporal or spatio-temporal Folds for cross validation based on pre-defined groups
CreateSpacetimeFolds(
x,
spacevar = NA,
timevar = NA,
k = 10,
class = NA,
seed = sample(1:1000, 1)
)
x | data.frame containing spatio-temporal data
spacevar | Character indicating which column of x identifies the spatial units (e.g. ID of weather stations)
timevar | Character indicating which column of x identifies the temporal units (e.g. the day of the year)
k | numeric. Number of folds. For a leave-one-location-out or leave-one-time-step-out cross-validation (i.e. only one of spacevar or timevar is set and the other is NA), set k to the number of unique spatial or temporal units.
class | Character indicating which column of x identifies a class unit (e.g. land cover)
seed | numeric. See ?set.seed
The function creates train and test sets by taking (spatial and/or temporal) groups into account.
In contrast to nndm, it requires that the groups are already defined (e.g. spatial clusters or blocks or temporal units).
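The idea of keeping pre-defined groups together can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper make_group_folds and the toy data are hypothetical and exist only for this example.

```r
# Hypothetical helper (not part of any package): assign whole groups to folds
make_group_folds <- function(groups, k = 10, seed = 1) {
  set.seed(seed)
  units <- unique(as.character(groups))
  k <- min(k, length(units))  # cannot have more folds than groups
  # Shuffle fold labels over the unique groups so folds hold similar numbers of groups
  fold_of_unit <- sample(rep(seq_len(k), length.out = length(units)))
  names(fold_of_unit) <- units
  # Every row inherits the fold of its group
  fold_of_row <- unname(fold_of_unit[as.character(groups)])
  list(
    index    = lapply(seq_len(k), function(i) which(fold_of_row != i)),
    indexOut = lapply(seq_len(k), function(i) which(fold_of_row == i))
  )
}

# Toy data (made up): 6 stations with 4 observations each
toy <- data.frame(station = rep(paste0("S", 1:6), each = 4),
                  value = seq_len(24))
folds <- make_group_folds(toy$station, k = 3, seed = 42)

# Count stations that occur in both the training and the held-out rows of a fold
overlap <- sapply(seq_along(folds$index), function(i) {
  length(intersect(toy$station[folds$index[[i]]],
                   toy$station[folds$indexOut[[i]]]))
})
print(overlap)
```

Because fold labels are drawn per group rather than per row, all observations of a station end up on the same side of each split.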
Using "class" is helpful when the data are clustered in space
and are categorical. This is the case, for example, for land cover classifications where
training data come as training polygons. Here the data should be split so
that entire polygons are held back (spacevar="polygonID") while the distribution of classes
stays similar in each fold (class="LUC").
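A rough sketch of the polygon case, assuming each polygon belongs to exactly one class: polygons are distributed over the folds separately within each class, so no polygon is split and every fold sees every class. The helper make_stratified_group_folds and the toy data are hypothetical; the package may do this differently internally.

```r
# Hypothetical helper: assign whole polygons to folds, class by class
make_stratified_group_folds <- function(groups, classes, k = 3, seed = 1) {
  set.seed(seed)
  # One row per polygon with its class (assumes a single class per polygon)
  units <- unique(data.frame(group = as.character(groups),
                             class = as.character(classes),
                             stringsAsFactors = FALSE))
  units$fold <- NA_integer_
  for (cl in unique(units$class)) {
    rows <- which(units$class == cl)
    # Spread the polygons of this class over the folds in random order
    units$fold[rows] <- sample(rep(seq_len(k), length.out = length(rows)))
  }
  fold_of_row <- units$fold[match(as.character(groups), units$group)]
  list(
    index    = lapply(seq_len(k), function(i) which(fold_of_row != i)),
    indexOut = lapply(seq_len(k), function(i) which(fold_of_row == i))
  )
}

# Toy data (made up): 6 polygons, 3 per land cover class, 5 pixels each
toy <- data.frame(polygonID = rep(paste0("P", 1:6), each = 5),
                  LUC = rep(c("forest", "water"), each = 15))
folds <- make_stratified_group_folds(toy$polygonID, toy$LUC, k = 3, seed = 7)

# Number of distinct classes and polygons in the held-out part of each fold
classes_per_fold  <- sapply(folds$indexOut, function(i) length(unique(toy$LUC[i])))
polygons_per_fold <- sapply(folds$indexOut, function(i) length(unique(toy$polygonID[i])))
print(classes_per_fold)
print(polygons_per_fold)
```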
A list containing two lists, one for model training and one for model validation, that can be used directly as "index" and "indexOut" in caret's trainControl function.
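To make the shape of such a list concrete, the sketch below builds one by hand and uses it in a manual cross-validation loop with base R's lm. The toy data, the column names and the hand-built indices object are assumptions for illustration only; they merely mimic the two-element structure described above.

```r
# Toy data (made up): three stations with ten observations each
set.seed(1)
toy <- data.frame(station = rep(c("A", "B", "C"), each = 10),
                  x = rnorm(30))
toy$y <- 2 * toy$x + rnorm(30, sd = 0.1)

# Hand-built list with one entry per fold: training rows and held-out rows
units <- unique(toy$station)
indices <- list(
  index    = lapply(units, function(u) which(toy$station != u)),
  indexOut = lapply(units, function(u) which(toy$station == u))
)

# Fit on the training rows and score on the held-out rows, fold by fold
rmse <- sapply(seq_along(indices$index), function(i) {
  fit  <- lm(y ~ x, data = toy[indices$index[[i]], ])
  pred <- predict(fit, newdata = toy[indices$indexOut[[i]], ])
  sqrt(mean((toy$y[indices$indexOut[[i]]] - pred)^2))
})
print(rmse)
```

With caret, the same two elements would be passed as trainControl(method = "cv", index = indices$index, indexOut = indices$indexOut) instead of looping manually.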
Standard k-fold cross-validation can lead to considerable misinterpretation in spatial-temporal modelling tasks. This function can be used to prepare a Leave-Location-Out, Leave-Time-Out or Leave-Location-and-Time-Out cross-validation as target-oriented validation strategies for spatial-temporal prediction tasks. See Meyer et al. (2018) for further information.
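Why random folds can mislead may be illustrated with a small base R check on made-up data: under a random split, the held-out observations almost always come from stations that also contribute training data, whereas a leave-location-out split holds back whole stations. The variable names below are hypothetical.

```r
# Toy data (made up): 5 stations with 20 observations each
set.seed(123)
toy <- data.frame(station = rep(paste0("S", 1:5), each = 20))
k <- 5

# Random k-fold: rows are assigned to folds regardless of their station
random_fold <- sample(rep(seq_len(k), length.out = nrow(toy)))
# Leave-location-out: each station forms its own fold
group_fold <- match(toy$station, unique(toy$station))

# Share of held-out rows whose station is also present in the training rows
shared_share <- function(fold) {
  sapply(seq_len(k), function(i) {
    mean(toy$station[fold == i] %in% toy$station[fold != i])
  })
}
shared_random <- shared_share(random_fold)
shared_group  <- shared_share(group_fold)
print(shared_random)
print(shared_group)
```

A high share under the random split means the model is validated on locations it has already seen, which tends to give overly optimistic error estimates for predictions at new locations.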
Hanna Meyer
Meyer, H., Reudenbach, C., Hengl, T., Katurji, M., Nauß, T. (2018): Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environmental Modelling & Software 101: 1-9.
trainControl, ffs, nndm
## Not run:
library(CAST)  # provides CreateSpacetimeFolds
data(cookfarm)
### Prepare for 10-fold Leave-Location-and-Time-Out cross validation
indices <- CreateSpacetimeFolds(cookfarm, "SOURCEID", "Date")
str(indices)
### Prepare for 10-fold Leave-Location-Out cross validation
indices <- CreateSpacetimeFolds(cookfarm, spacevar = "SOURCEID")
str(indices)
### Prepare for Leave-One-Location-Out cross validation
### (k equals the number of unique spatial units)
indices <- CreateSpacetimeFolds(cookfarm, spacevar = "SOURCEID",
                                k = length(unique(cookfarm$SOURCEID)))
str(indices)
## End(Not run)