KnnDistCV: K-Nearest Neighbour correct cross-validation with distance...


View source: R/KNNCrossValidation_functions.R

Description

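This function takes a square matrix of distances among specimens of known group membership and carries out a leave-one-out identification of each specimen: each specimen is classified from its nearest neighbours among the remaining specimens, and the classification is compared with its known group. The proportion of correct identifications gives the correct cross-validation percentage.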
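The leave-one-out idea can be illustrated with a minimal base R sketch. `LooKnnSketch` is a hypothetical helper written for this illustration and is not the package's implementation; in particular it resolves ties by taking the first label, which is an assumption made to keep the example short.

```r
# Hypothetical helper: leave-one-out k-NN classification from a distance matrix.
LooKnnSketch <- function(DistMat, GroupMembership, K) {
  Groups <- as.character(GroupMembership)
  Predicted <- character(length(Groups))
  for (i in seq_along(Groups)) {
    Dists <- DistMat[i, -i]                      # distances to all other specimens
    Others <- Groups[-i]                         # their known groups
    Nearest <- Others[order(Dists)[seq_len(K)]]  # groups of the K nearest neighbours
    Votes <- table(Nearest)
    # ASSUMPTION: which.max() keeps the first maximum, so ties go to the
    # alphabetically first label; the real function uses TieBreaker instead.
    Predicted[i] <- names(Votes)[which.max(Votes)]
  }
  data.frame(Known = Groups, Predicted = Predicted, stringsAsFactors = FALSE)
}

# Toy data: two well-separated groups on a line.
X <- c(0, 1, 2, 10, 11, 12)
Groups <- c("A", "A", "A", "B", "B", "B")
DistMat <- as.matrix(dist(X))

Res <- LooKnnSketch(DistMat, Groups, K = 1)
CcvPercent <- 100 * mean(Res$Known == Res$Predicted)
```

Because each specimen is removed before its neighbours are found, a specimen can never vote for itself, which is what separates cross-validation from simple resubstitution.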

Usage

KnnDistCV(
  DistMat,
  GroupMembership,
  K,
  Equal = TRUE,
  EqualIter = 100,
  SampleSize = NA,
  TieBreaker = c("Random", "Remove", "Report"),
  Verbose = FALSE,
  IgnorePrompts = FALSE
)
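A call might look like the sketch below, which builds a distance matrix from the built-in `iris` measurements. The choice of data, `K = 5` and `Equal = FALSE` are illustrative assumptions. The call is guarded so that the code still runs when the package is not installed, and the result is inspected with `str()` rather than assuming its exact shape.

```r
# Build a small labelled data set and its pairwise distance matrix.
Measurements <- as.matrix(iris[, 1:4])
DistMat <- as.matrix(dist(Measurements))  # square, symmetric, zero diagonal
Groups <- iris$Species                    # same order as the rows of DistMat

# Only attempt the call when the KnnDist package is available.
if (requireNamespace("KnnDist", quietly = TRUE)) {
  Res <- KnnDist::KnnDistCV(
    DistMat = DistMat,
    GroupMembership = Groups,
    K = 5,
    Equal = FALSE,
    TieBreaker = "Random",
    Verbose = FALSE
  )
  str(Res)  # inspect the returned structure
}
```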

Arguments

DistMat

is a square matrix of pairwise distances among all reference specimens.

GroupMembership

a character or factor vector in the same order as the distance data to denote group membership.

K

is the number of nearest neighbours that the method will use for assigning group classification.

Equal

indicates whether groups should be subsampled to equal sample size.

EqualIter

sets the number of resampling iterations that will be carried out when groups are subsampled to equal sample size.

SampleSize

is the sample number that groups will be subsampled to if Equal is set to TRUE. The default is set to NA and will therefore use the smallest sample size of the groups provided.
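The subsampling controlled by Equal, EqualIter and SampleSize can be sketched as follows. `BalancedIndexSketch` is a hypothetical helper, not part of the package, and sampling without replacement is an assumption.

```r
# Hypothetical helper: draw an equal-sized subsample of specimen indices per group.
BalancedIndexSketch <- function(GroupMembership, SampleSize = NA) {
  Groups <- as.character(GroupMembership)
  # With SampleSize = NA, fall back to the smallest group size.
  if (is.na(SampleSize)) SampleSize <- min(table(Groups))
  IndexByGroup <- split(seq_along(Groups), Groups)
  # ASSUMPTION: sampling without replacement. Indexing via sample.int()
  # avoids sample()'s special behaviour on length-one input.
  Picked <- lapply(IndexByGroup, function(Idx) {
    Idx[sample.int(length(Idx), SampleSize)]
  })
  sort(unlist(Picked, use.names = FALSE))
}

set.seed(1)
Groups <- rep(c("A", "B", "C"), times = c(10, 6, 4))
Keep <- BalancedIndexSketch(Groups)
table(Groups[Keep])  # every group now has 4 specimens
```

In a full run the distance matrix would be subset as `DistMat[Keep, Keep]` before cross-validation, and the draw repeated EqualIter times so that the result does not depend on one particular subsample.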

TieBreaker

is the method used to break ties if there is no majority among the K nearest neighbours. Three methods are available ('Random', 'Remove' and 'Report'): Random randomly returns one of the tied classifications; Remove returns 'UnIDed' for the classification; Report returns the multiple classifications as a single character string with the tied classifications separated by '_'. NOTE: for correct cross-validation procedures, a tied result returned by Report will be considered an incorrect identification even if one of the multiple reported classifications is correct.
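The three tie-breaking behaviours described above can be sketched in base R. `TieBreakSketch` is a hypothetical helper written for illustration; only the option names, the 'UnIDed' label and the '_' separator come from the documentation.

```r
# Hypothetical helper: resolve the vote among the K nearest neighbours.
TieBreakSketch <- function(Neighbours,
                           TieBreaker = c("Random", "Remove", "Report")) {
  TieBreaker <- match.arg(TieBreaker)
  Votes <- table(Neighbours)
  Counts <- as.vector(Votes)
  Top <- names(Votes)[Counts == max(Counts)]
  if (length(Top) == 1) return(Top)             # clear winner, no tie to break
  switch(TieBreaker,
    Random = Top[sample.int(length(Top), 1)],   # pick one tied group at random
    Remove = "UnIDed",                          # leave the specimen unidentified
    Report = paste(Top, collapse = "_")         # report all tied groups together
  )
}

Tied <- c("A", "B", "A", "B")
TieBreakSketch(Tied, "Remove")              # "UnIDed"
TieBreakSketch(Tied, "Report")              # "A_B"
TieBreakSketch(c("A", "A", "B"), "Report")  # "A"
```

A string such as "A_B" never equals a single known group label, which is consistent with a reported tie being counted as an incorrect identification.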

Verbose

determines whether the cross-validation results for each reference specimen are returned. Note that if this is set to TRUE and Equal is set to TRUE, the function will return a list with the results of each iteration, which will slow the process dramatically and use a lot of local memory.

IgnorePrompts

if both Verbose and Equal are set to TRUE, then the function will ask if you are sure you wish to continue; setting IgnorePrompts to TRUE will skip this question.

Details

The function also provides functionality to resample unequal groups to equal sample size a set number of times.

This function applies both a weighted approach and an unweighted approach and returns both results.
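The difference between the two approaches can be illustrated with a small sketch. The documentation does not state which weighting scheme is used, so the inverse-distance weights below are purely an assumption, and `VoteSketch` is a hypothetical helper.

```r
# Hypothetical helper: compare an unweighted and a weighted neighbour vote.
VoteSketch <- function(NeighbourGroups, NeighbourDists) {
  # Unweighted: every neighbour counts as one vote.
  Unweighted <- tapply(rep(1, length(NeighbourGroups)), NeighbourGroups, sum)
  # ASSUMPTION: inverse-distance weights, so closer neighbours count more.
  # A zero distance would give an infinite weight and needs separate handling.
  Weighted <- tapply(1 / NeighbourDists, NeighbourGroups, sum)
  c(Unweighted = names(Unweighted)[which.max(Unweighted)],
    Weighted = names(Weighted)[which.max(Weighted)])
}

# One very close "A" neighbour against two distant "B" neighbours.
Res <- VoteSketch(c("A", "B", "B"), c(0.1, 2, 3))
Res
```

Here the unweighted vote picks "B" (two votes to one), while the weighted vote picks "A" because the single "A" neighbour is far closer. This is the kind of disagreement that makes it useful to have both results returned.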

Value

Returns a matrix of the leave-one-out classifications for all the specimens along with their known classification.

Author(s)

Ardern Hulme-Beaman


ArdernHB/KnnDist documentation built on Feb. 5, 2021, 5:09 a.m.