ffs_train is a wrapper function for a simple use of the forward feature selection approach for training random forest classification models. This kind of validation is particularly suitable for leave-location-out cross-validation, where variable selection must be based on the performance of the model on the held-out location (station). See Meyer et al. (2018) for further details. This is the case when vegetation patterns that vary in time and space are used for classification purposes. For UAV-based RGB/NIR imagery, the function provides an optimized preconfiguration for the classification goals.
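The idea of selecting variables by their performance on held-out locations can be illustrated with a small base R sketch. It is not the implementation used by ffs_train: the toy data, the nearest-class-mean classifier, and the helper names llo_accuracy and forward_select are assumptions made for illustration, and the greedy loop is a simplified variant that starts from single predictors.

```r
set.seed(100)
n <- 120
# Toy training table (invented values); column names follow the usage defaults.
toyDF <- data.frame(
  ID = factor(rep(c("tree", "no tree"), each = n / 2)),
  FN = rep(c("img_a", "img_b", "img_c"), times = n / 3)
)
# By construction: G is informative, R only weakly, B is pure noise.
is_tree <- toyDF$ID == "tree"
toyDF$R <- rnorm(n, mean = ifelse(is_tree, 0.5, 0))
toyDF$G <- rnorm(n, mean = ifelse(is_tree, 3, 0))
toyDF$B <- rnorm(n)

# Hypothetical helper: leave-location-out accuracy of a
# nearest-class-mean classifier (a stand-in for the random forest).
llo_accuracy <- function(df, vars, response = "ID", spaceVar = "FN") {
  accs <- sapply(unique(df[[spaceVar]]), function(loc) {
    train <- df[df[[spaceVar]] != loc, ]
    test  <- df[df[[spaceVar]] == loc, ]
    # Class centroids: one row per class, one column per predictor.
    cen <- sapply(vars, function(v) tapply(train[[v]], train[[response]], mean))
    X <- as.matrix(test[, vars, drop = FALSE])
    # Squared distance of every test row to every class centroid.
    d <- sapply(rownames(cen), function(k) rowSums(sweep(X, 2, cen[k, ])^2))
    pred <- colnames(d)[max.col(-d, ties.method = "first")]
    mean(pred == as.character(test[[response]]))
  })
  mean(accs)
}

# Hypothetical helper: greedy forward selection. Add the predictor that
# improves the held-out accuracy most; stop when nothing improves.
forward_select <- function(df, candidates) {
  selected <- character(0)
  best_acc <- -Inf
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    accs <- sapply(remaining, function(v) llo_accuracy(df, c(selected, v)))
    if (max(accs) <= best_acc) break
    best_acc <- max(accs)
    selected <- c(selected, names(accs)[which.max(accs)])
  }
  list(selected = selected, accuracy = best_acc)
}

fit <- forward_select(toyDF, c("R", "G", "B"))
print(fit)
```

Because every candidate set is scored only on locations that were not used for fitting, predictors that merely memorize a single image gain nothing in this score.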
ffs_train(
trainingDF = NULL,
predictors = c("R", "G", "B"),
response = "ID",
spaceVar = "FN",
names = c("ID", "R", "G", "B", "A", "FN"),
noLoc = NULL,
sumFunction = "twoClassSummary",
pVal = 0.5,
prefin = "final_",
preffs = "ffs_",
modelSaveName = "model.RData",
runtest = FALSE,
seed = 100,
withinSE = TRUE,
mtry = 2,
noClu = 1
)
trainingDF |
data.frame. the training data |
predictors |
character. vector of predictor names as given by the header of the training data table |
response |
character. name of response variable as given by the header of the training data table |
spaceVar |
character. name of the space-time splitting variable as given by the header of the training data table |
names |
character. all names of the dataframe header |
noLoc |
numeric. number of locations to leave out, usually the number of discrete training locations/images |
sumFunction |
character. name of the summary function; default is "twoClassSummary" |
pVal |
numeric. fraction of the training data to use; default is 0.5 |
prefin |
character. name pattern used for the model; default is "final_" |
preffs |
character. name pattern used for ffs; default is "ffs_" |
modelSaveName |
character. name pattern used for saving the model; default is "model.RData" |
runtest |
logical. default is FALSE; if TRUE, an external validation will be performed |
seed |
numeric. number for seeding |
withinSE |
logical. if TRUE, compares the performance to models that use fewer variables (e.g. if a model using 5 variables is better than a model using 4 variables but still within the standard error of the 4-variable model, then the 4-variable model is rated as the better model). |
mtry |
numeric. number of variables randomly sampled as candidates at each split |
noClu |
numeric. number of clusters to be used |
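The withinSE comparison described above can be sketched in a few lines of base R. The accuracy and standard error values are invented, and pick_within_se is a hypothetical helper; the underlying implementation may apply the rule differently in detail.

```r
# Hypothetical cross-validation results per model size (made-up numbers).
results <- data.frame(
  nvar     = c(2, 3, 4, 5),
  accuracy = c(0.71, 0.80, 0.84, 0.85),
  se       = c(0.03, 0.02, 0.02, 0.02)
)

# A larger model only replaces the current best model if it beats it
# by more than the current best model's standard error.
pick_within_se <- function(res) {
  res  <- res[order(res$nvar), ]
  best <- 1
  for (i in seq_len(nrow(res))[-1]) {
    if (res$accuracy[i] > res$accuracy[best] + res$se[best]) {
      best <- i
    }
  }
  res[best, ]
}

chosen <- pick_within_se(results)
print(chosen)
```

In this toy table the 5-variable model has the highest accuracy, but its gain over the 4-variable model is smaller than that model's standard error, so the 4-variable model is kept.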
model of a forward feature selection driven random forest classification
The workflow of uavRst is intended to use the forward feature selection as described by Meyer et al. (2018).
This approach needs at least a pair of images that differ in time and/or space for the leave-one-location-out validation mode. You can work around this requirement by tiling your image and providing separate training data for each tile.
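A minimal sketch of how leave-one-location-out folds can be derived from the splitting variable, using base R only. The toy table and its values are invented; the column names follow the defaults shown in the usage, and the fold structure is an illustration rather than the exact object the wrapper builds internally.

```r
set.seed(100)
# Toy training table: three locations/images with four samples each.
toyDF <- data.frame(
  ID = factor(sample(c("tree", "no tree"), 12, replace = TRUE)),
  R  = runif(12), G = runif(12), B = runif(12),
  FN = rep(c("img_a", "img_b", "img_c"), each = 4)
)

# One fold per location: train on all other locations, hold this one out.
locations <- unique(toyDF$FN)
folds <- lapply(locations, function(loc) {
  list(train = which(toyDF$FN != loc),
       test  = which(toyDF$FN == loc))
})
names(folds) <- locations

# Each held-out set contains rows from exactly one location.
str(folds)
```

With a single tiled image, the same construction applies if a tile identifier takes the place of FN.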
If you just want to classify a single image with a single training file, use the normal procedure as provided by the trainControl function.
## Not run:
require(uavRst)
##- project folder
projRootDir<-tempdir()
# create subfolders; please mind that the paths are exported as global variables
paths<-link2GI::initProj(projRootDir = projRootDir,
projFolders = c("data/","data/ref/","output/","run/","las/"),
global = TRUE,
path_prefix = "path_")
setwd(path_run)
unlink(file.path(tempdir(),"*"), force = TRUE)
##- get the rgb image, chm and training data
utils::download.file("https://github.com/gisma/gismaData/raw/master/uavRst/data/ffs.zip",
file.path(tempdir(),"ffs.zip"))
unzip(zipfile = file.path(tempdir(),"ffs.zip"),exdir = ".")
##- get geometrical training data, assuming that you have used the calc_ext function before
trainDF<-readRDS(file.path(tempdir(),"tutorial_trainDF.rds"))
load(file.path(tempdir(),"tutorialbandNames.RData"))
##- define the classes
idNumber <- c(1, 2, 3)
idNames <- c("green tree", "yellow tree", "no tree")
##- add class names
for (i in seq_along(idNumber)) {
  trainDF$ID[trainDF$ID == idNumber[i]] <- idNames[i]
}
##- convert to factor (required by rf)
trainDF$ID <- as.factor(trainDF$ID)
##- now prepare the predictor and response variable names
##- get actual name list from the DF
name<-names(trainDF)
##- drop the leading ID and the trailing FN column
predictNames <- name[2:(length(name) - 1)]
##- call Training
model <- ffs_train(trainingDF= trainDF,
predictors= predictNames,
response = "ID",
spaceVar = "FN",
names = name,
pVal = 0.1,
noClu = 1)
##- for classification/prediction go ahead with the predict_RGB function
##+
## End(Not run)