kNN.classification: k Nearest Neighbours Classification
In hkauhanen/pbcm: Parametric Bootstrap Cross-Fitting Method


View source: R/kNN.classification.R

Carry out k Nearest Neighbours (k-NN) classification on the results of a parametric bootstrap.

kNN.classification(df, DeltaGoF.emp, k, ties = "model2", verbose = TRUE)

`df`	Results of bootstrap; the output of `pbcm.di` or `pbcm.du`
`DeltaGoF.emp`	Empirical value of goodness of fit (e.g. from `empirical.GoF`)
`k`	Number of neighbours to employ in classification; may be a vector of integers
`ties`	How ties (equal distance to the two distributions) should be broken. By default, ties are broken in favour of model 2, which is taken to be the null model in the comparison.
`verbose`	If `TRUE`, warnings are issued to the console

Calculates the cumulative distance (sum of squared differences) of `DeltaGoF.emp` to both DeltaGoF distributions found in `df` (i.e. one with model 1 as generator and one with model 2 as generator), taking into account the `k` nearest neighbours only. Decides in favour of model 1 if the cumulative distance to the model 1 distribution is smaller than the distance to the model 2 distribution, and vice versa. If the distances are equal, the decision is made according to the `ties` argument.
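The decision rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: `knn_decide_sketch` is a hypothetical helper, it takes the two DeltaGoF distributions as plain numeric vectors rather than as the `df` returned by `pbcm.di` or `pbcm.du`, it handles a single value of `k`, and the output column names are invented for the example.

```r
# Hypothetical helper (NOT part of pbcm): sketches the k-NN decision rule
# for a single value of k, given the two DeltaGoF distributions as vectors.
knn_decide_sketch <- function(delta1, delta2, emp, k, ties = "model2") {
  # k cannot exceed the number of bootstrap values in either distribution
  stopifnot(k >= 1, k <= min(length(delta1), length(delta2)))

  # Squared differences between the empirical value and each bootstrap value
  sq1 <- (delta1 - emp)^2
  sq2 <- (delta2 - emp)^2

  # Cumulative distance: sum of the k smallest squared differences
  dist1 <- sum(sort(sq1)[seq_len(k)])
  dist2 <- sum(sort(sq2)[seq_len(k)])

  # Smaller cumulative distance wins; equal distances fall back to `ties`
  decision <- if (dist1 < dist2) {
    "model1"
  } else if (dist2 < dist1) {
    "model2"
  } else {
    ties
  }

  # Column names here are illustrative assumptions
  data.frame(k = k, dist_model1 = dist1, dist_model2 = dist2,
             decision = decision)
}

# Toy DeltaGoF values, made up purely for illustration
delta_gen1 <- c(-3, -2, -1, 0)  # model 1 as generator
delta_gen2 <- c(1, 2, 3, 4)     # model 2 as generator
res <- knn_decide_sketch(delta_gen1, delta_gen2, emp = -1.5, k = 2)
print(res)
```

With these toy values the empirical value lies inside the model 1 distribution, so the cumulative distance to model 1 is the smaller of the two and the sketch decides in favour of model 1. To cover several values of `k`, the helper could be applied to each value in turn and the resulting rows bound together.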

A data frame containing the computed distances and decisions, with one row per value of `k`.

Henri Kauhanen

Schultheis, H. & Singhaniya, A. (2015) Decision criteria for model comparison using the parametric bootstrap cross-fitting method. Cognitive Systems Research, 33, 100–121. https://doi.org/10.1016/j.cogsys.2014.09.003

empirical.GoF, pbcm.di, pbcm.du

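library(pbcm)

# Mock data: y is x plus Gaussian noise
x <- seq(from=0, to=1, length.out=100)
mockdata <- data.frame(x=x, y=x + rnorm(100, 0, 0.5))

# Fit y = a*x^p for a fixed exponent p; return the coefficient and the
# deviance as the goodness of fit
myfitfun <- function(data, p) {
  res <- nls(y~a*x^p, data, start=list(a=1.1))
  list(a=coef(res), GoF=deviance(res))
}

# Generate a synthetic data set from a fitted model with exponent p
mygenfun <- function(model, p) {
  x <- seq(from=0, to=1, length.out=100)
  y <- model$a*x^p + rnorm(100, 0, 0.5)
  data.frame(x=x, y=y)
}

# Parametric bootstrap: model 1 uses p=1, model 2 uses p=2
pb <- pbcm.di(data=mockdata, fun1=myfitfun, fun2=myfitfun, genfun1=mygenfun,
        genfun2=mygenfun, reps=20, args1=list(p=1), args2=list(p=2),
        genargs1=list(p=1), genargs2=list(p=2))

# Empirical difference in goodness of fit on the original data
emp <- empirical.GoF(mockdata, fun1=myfitfun, fun2=myfitfun,
                     args1=list(p=1), args2=list(p=2))

# Classify the empirical value using 10 and 20 nearest neighbours
kNN.classification(df=pb, DeltaGoF.emp=emp$DeltaGoF, k=c(10, 20))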