superClass | R Documentation |
Supervised classification of raster imagery, in either classification or regression mode, based on vector training data (points or polygons).
superClass(
img,
trainData,
valData = NULL,
responseCol = NULL,
nSamples = 1000,
nSamplesV = 1000,
polygonBasedCV = FALSE,
trainPartition = NULL,
model = "rf",
tuneLength = 3,
kfold = 5,
minDist = 2,
mode = "classification",
predict = TRUE,
predType = "raw",
filename = NULL,
verbose,
overwrite = TRUE,
...
)
img |
SpatRaster. Typically remote sensing imagery, which is to be classified. |
trainData |
sf or sp spatial vector data containing the training locations (POINTs or POLYGONs). |
valData |
sf or sp spatial vector data containing the validation locations (POINTs or POLYGONs) (optional). |
responseCol |
Character or integer giving the column in |
nSamples |
Integer. Number of samples per land cover class. If |
nSamplesV |
Integer. Number of validation samples per land cover class. If |
polygonBasedCV |
Logical. If |
trainPartition |
Numeric. Partition (polygon based) of |
model |
Character. Which model to use. See train for options. Defaults to randomForest ('rf'). In addition to the standard caret models, a maximum likelihood classification is available via |
tuneLength |
Integer. Number of levels for each tuning parameter (see train for details). |
kfold |
Integer. Number of cross-validation resamples during model tuning. |
minDist |
Numeric. Minimum distance between training and validation data,
e.g. |
mode |
Character. Model type: 'regression' or 'classification'. |
predict |
Logical. Produce a map (TRUE, default) or only fit and validate the model (FALSE). |
predType |
Character. Type of the final output raster. Either "raw" for class predictions or "prob" for class probabilities. Class probabilities are not available for all classification models (see predict.train). |
filename |
Path to output file (optional). If |
verbose |
Logical. Print progress and statistics during execution.
overwrite |
Logical. Overwrite the spatial prediction raster if it already exists. |
... |
further arguments to be passed to |
Note that superClass automatically loads the lattice and randomForest packages. superClass performs the following steps:
Ensure non-overlap between training and validation data. This is necessary to avoid biased performance estimates. A minimum distance (minDist) in pixels can be provided to enforce a given distance between training and validation data.
Sample training coordinates. If trainData (and valData, if present) are polygons, superClass will calculate the area per polygon and sample nSamples locations per class within these polygons. The number of samples per individual polygon scales with the polygon area, i.e. the bigger the polygon, the more samples.
Split training/validation. If valData was provided (recommended), the samples from these polygons will be held out and used only for validation, not for model fitting. If trainPartition is provided, the training polygons will be divided into training polygons and validation polygons.
Extract raster data. The predictor values on the sample pixels are extracted from img.
Fit the model. The model is fit with caret::train on the sampled training data, including parameter tuning (tuneLength) in kfold cross-validation. polygonBasedCV=TRUE will define cross-validation folds based on polygons (recommended); otherwise cross-validation will be performed on a per-pixel basis.
Predict the classes of all pixels in img based on the final model.
Validate the model with the independent validation data.
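The area-proportional sampling and polygon-based fold assignment described in the steps above can be illustrated with a small base R sketch. This is not the RStoolbox implementation: the polys table, its class names and areas, and the helper variables share and polygon_fold are hypothetical and serve only to show the idea.

```r
# Hypothetical polygon table (not RStoolbox data): one row per training polygon
polys <- data.frame(
  id    = 1:5,
  class = c("forest", "forest", "water", "water", "water"),
  area  = c(300, 100, 50, 25, 25)
)
nSamples <- 100  # requested samples per class
kfold    <- 2    # number of cross-validation folds

# Area share of each polygon within its class, scaled to nSamples.
# With other areas, rounding may make the per-class total differ slightly
# from nSamples.
share   <- ave(polys$area, polys$class, FUN = function(a) a / sum(a))
polys$n <- round(share * nSamples)

# One row per sampled location, remembering the polygon it came from
samples <- data.frame(polygon = rep(polys$id, times = polys$n))

# Polygon-based folds: assign whole polygons (not single pixels) to folds,
# so that samples from one polygon never end up in different folds
set.seed(42)
polygon_fold <- sample(rep_len(seq_len(kfold), nrow(polys)))
samples$fold <- polygon_fold[match(samples$polygon, polys$id)]

polys
table(polygon = samples$polygon, fold = samples$fold)
```

Keeping all samples of a polygon in the same fold avoids testing on pixels that are direct neighbours of training pixels, which is the motivation for recommending polygonBasedCV=TRUE.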
A superClass object (effectively a list) containing:
$model: the fitted model
$modelFit: model fit statistics
$training: indexes of samples used for training
$validation: list of
$performance: performance estimates based on independent validation (confusion matrix etc.)
$validationSamples: actual pixel coordinates plus reference and predicted values used for validation
$validationGeometry: validation polygons (clipped with minDist to training geometries)
$map: the predicted raster
$classMapping: a data.frame containing an integer <-> label mapping
train
library(RStoolbox)
library(caret)
library(randomForest)
library(e1071)
library(terra)
train <- readRDS(system.file("external/trainingPoints_rlogo.rds", package="RStoolbox"))
## Plot training data
olpar <- par(no.readonly = TRUE) # back-up par
par(mfrow=c(1,2))
colors <- c("yellow", "green", "deeppink")
plotRGB(rlogo)
plot(train, add = TRUE, col = colors[train$class], pch = 19)
## Fit classifier (splitting training into 70% training data, 30% validation data)
SC <- superClass(rlogo, trainData = train, responseCol = "class",
model = "rf", tuneLength = 1, trainPartition = 0.7)
SC
## Plots
plot(SC$map, col = colors, legend = FALSE, axes = FALSE, box = FALSE)
legend(1,1, legend = levels(train$class), fill = colors , title = "Classes",
horiz = TRUE, bty = "n")
par(olpar) # reset par
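The components listed in the return value can be inspected directly on the fitted object. The following is a minimal sketch that repeats the fit from the example so that it runs on its own; it assumes the component names are exactly as documented above (with $performance nested inside $validation), and the set.seed call is an addition for reproducibility, not part of the original example.

```r
library(RStoolbox)
library(caret)
library(randomForest)
library(terra)

# Same training points and call as in the example above
train <- readRDS(system.file("external/trainingPoints_rlogo.rds", package = "RStoolbox"))
set.seed(1)  # assumption: added here only to make the split repeatable
SC <- superClass(rlogo, trainData = train, responseCol = "class",
                 model = "rf", tuneLength = 1, trainPartition = 0.7)

names(SC)                   # top-level components of the superClass object
SC$model                    # the fitted model
SC$validation$performance   # performance estimates from the held-out 30%
SC$classMapping             # integer <-> label mapping used in SC$map
```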