This function trains a k-NN model from response variables (Y) and predictors
(X) at reference observations using the package yaImpute (see yai). By default,
the distance between observations is obtained from the proximity matrix of
random forest regression or classification trees. Optionally, training and
testing sets can be provided to return the accuracy of the trained k-NN model.
x
A dataframe or SpatialPointsDataFrame of predictor variables X for the reference observations. Row names of X are used to identify the reference observations.
y
A dataframe or SpatialPointsDataFrame of response variables Y for the reference observations. Row names of Y are used to identify the reference observations.
inTrain
Optional. A list obtained from
inTest
Optional list indicating which rows of x and y go to validation. If left NULL, all rows that are not in
k
Integer. Number of nearest neighbors.
method
Character. Which nearness metric is used to compute the nearest neighbors. Default is
impute.cont
Character. The method used to compute the imputed continuous variables. Can be
impute.fac
Character. The method used to compute the imputed values for factors. Default value is the same as
ntree
Number of classification or regression trees drawn for each response variable. Default is 500.
mtry
Number of X variables picked randomly to split each node. Default is sqrt(number of X variables).
rfMode
By default,
...
Other arguments passed to
If performing model validation, the function trains a k-NN model from the
training set, finds the k nearest neighbors (NN) of each observation in the
validation set and imputes the response variables from those k NN. If k = 1,
only the closest NN value is imputed. If k > 1, the imputed value can be the
closest NN value, the mean, the median or the distance-weighted mean of the
k NN values. This is controlled by the arguments impute.cont or impute.fac.
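The imputation options described above can be illustrated with a minimal base R sketch. The helper `impute_from_neighbors()` is hypothetical and is not part of foster or yaImpute; the method names and the 1 / (1 + d) weighting are assumptions made for illustration only, and the actual computation is delegated by the package to yaImpute.

```r
# Hypothetical helper (not from foster or yaImpute): combines the response
# values of the k nearest neighbors of one target into a single imputed value.
impute_from_neighbors <- function(values, distances,
                                  method = c("closest", "mean",
                                             "median", "dstWeighted")) {
  method <- match.arg(method)
  stopifnot(length(values) == length(distances))
  switch(method,
    closest = values[which.min(distances)],  # value of the single nearest neighbor
    mean = mean(values),
    median = stats::median(values),
    dstWeighted = {
      w <- 1 / (1 + distances)  # assumed weighting: closer neighbors count more
      sum(w * values) / sum(w)
    }
  )
}

# Made-up values and distances for k = 3 neighbors of one target observation
nn_values <- c(10, 14, 30)
nn_dist <- c(0, 1, 3)

imp_closest <- impute_from_neighbors(nn_values, nn_dist, "closest")
imp_mean <- impute_from_neighbors(nn_values, nn_dist, "mean")
imp_median <- impute_from_neighbors(nn_values, nn_dist, "median")
imp_weighted <- impute_from_neighbors(nn_values, nn_dist, "dstWeighted")
```

With k = 1 all four options return the same value, which is why the choice only matters when k > 1.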
If inTest = NULL, all rows that are not in inTrain will be used for model testing. If inTrain = NULL, all rows that are not in inTest will be used for model training. If both inTrain and inTest are NULL, all rows of x and y will be used for training and no testing is performed.
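The rules for resolving the training and testing rows can be sketched as follows. `resolve_split()` is a hypothetical helper written for illustration, not the package's internal code, and it assumes plain integer row indices rather than the list format expected by `inTrain` and `inTest`.

```r
# Hypothetical helper: derive training and testing rows from whichever of
# inTrain / inTest is supplied, following the rules stated in the text.
resolve_split <- function(n, inTrain = NULL, inTest = NULL) {
  all_rows <- seq_len(n)
  if (is.null(inTrain) && is.null(inTest)) {
    # Neither supplied: train on everything, no testing
    return(list(train = all_rows, test = integer(0)))
  }
  if (is.null(inTest)) inTest <- setdiff(all_rows, inTrain)
  if (is.null(inTrain)) inTrain <- setdiff(all_rows, inTest)
  list(train = inTrain, test = inTest)
}

split_a <- resolve_split(5, inTrain = c(1L, 2L, 3L))  # test rows inferred
split_b <- resolve_split(5, inTest = c(4L, 5L))       # train rows inferred
split_c <- resolve_split(5)                           # no testing performed
```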
The final model returned by findNN is trained from all observations of x and y.
A list containing the following objects:
model
A yai object, the trained k-NN model
preds
A data.frame with observed and predicted values of the testing set for each response variable
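As a rough illustration of how the observed and predicted values might be summarised, the sketch below computes a per-variable RMSE in base R. The data frame is a made-up stand-in for `preds`, and its column names (`variable`, `obs`, `preds`) are assumptions; check the structure of the object actually returned before reusing this.

```r
# Mock stand-in for the `preds` element of the returned list.
# Column names and values are assumptions for illustration only.
preds <- data.frame(
  variable = rep(c("height", "cover"), each = 3),
  obs = c(10, 12, 14, 50, 60, 70),
  preds = c(11, 12, 13, 55, 60, 65)
)

# Root mean square error of the imputed values, per response variable
rmse_by_var <- sapply(split(preds, preds$variable), function(d) {
  sqrt(mean((d$preds - d$obs)^2))
})
```

The package lists an `accuracy` function under See Also, which is presumably the intended way to obtain such summaries; the manual computation here only shows what the table contains.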
yai, newtargets, impute.yai, accuracy
# Load data in memory
# X_vars_sample: Predictor variables at sample (from getSample)
# Y_vars_sample: Response variables at sample (from getSample)
# train_idx: Rows of X_vars_sample and Y_vars_sample that are used for
#            training (from partition)
load(system.file("extdata/examples/example_trainNN.RData", package = "foster"))

set.seed(1234) # for example reproducibility
kNN <- trainNN(x = X_vars_sample,
               y = Y_vars_sample,
               inTrain = train_idx,
               k = 1,
               method = "randomForest",
               ntree = 200)