ldknn: Run the ldknn algorithm


View source: R/ldknn.R

Description

The ldknn algorithm detects bias in the composition of a lexical decision task, using k-nearest neighbor classification and the Levenshtein distance metric.

Usage

ldknn(stimuli, types, reference, k = 1, method='levenshtein', parallel = FALSE)

Arguments

stimuli

character strings corresponding to the stimuli in the experiment.

types

factor corresponding to the type of each stimulus in the experiment.

reference

a character string giving the reference level. Must be a level of the factor in types.

k

a value for the k parameter. Set to 1 by default.

method

the distance metric to use. Set to "levenshtein" by default.

  • "levenshtein": uses levenshtein.distance to calculate distances

  • "levenshtein.damerau": uses levenshtein.damerau.distance to calculate distances
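
To illustrate how the two metrics can differ, the following sketch compares plain Levenshtein distance (via base R's adist) with a hand-written optimal string alignment distance, a restricted Damerau-Levenshtein variant that counts a swap of adjacent letters as one edit. The helpers lev and osa are hypothetical names used only for illustration; the package's levenshtein.damerau.distance may implement a different variant.

```r
# Plain Levenshtein distance from base R (utils::adist).
lev <- function(a, b) as.integer(utils::adist(a, b))

# Hypothetical helper: optimal string alignment distance, which adds
# adjacent transpositions to insertions, deletions, and substitutions.
osa <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  d <- matrix(0L, n + 1, m + 1)
  d[, 1] <- 0:n  # cost of deleting every character of a prefix of a
  d[1, ] <- 0:m  # cost of inserting every character of a prefix of b
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- as.integer(x[i] != y[j])
      d[i + 1, j + 1] <- min(d[i, j + 1] + 1L,  # deletion
                             d[i + 1, j] + 1L,  # insertion
                             d[i, j] + cost)    # substitution
      if (i > 1 && j > 1 && x[i] == y[j - 1] && x[i - 1] == y[j]) {
        # adjacent transposition
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)
      }
    }
  }
  d[n + 1, m + 1]
}

lev("from", "form")  # 2: two substitutions
osa("from", "form")  # 1: one transposition
```

Under the transposition-aware metric, stimuli that differ only by a letter swap become closer neighbors, which may change which previous stimuli count as nearest.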

parallel

with parallel=TRUE, ldknn will run in parallel on multiple cores. The number of parallel processes is given by detectCores(logical = FALSE).
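
To see how many processes this would imply on a given machine, the value can be checked directly with the parallel package. This is a small sketch; the variable names are illustrative, and detectCores may return NA on platforms where the count cannot be determined.

```r
library(parallel)

# Physical cores: the count the parallel argument is described as using.
n_physical <- detectCores(logical = FALSE)

# Logical cores (including hyper-threads), shown for comparison.
n_logical <- detectCores(logical = TRUE)

cat("physical:", n_physical, " logical:", n_logical, "\n")
```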

Details

Combining k nearest neighbor classification with the Levenshtein distance produces an algorithm which can be described as follows. For an experiment containing a number of stimuli, which can be words or nonwords:

  1. Compute the Levenshtein distances between the currently presented stimulus and all previously presented stimuli.

  2. Identify the previously presented stimuli that are at the k nearest distances from the current stimulus.

  3. Compute the probability of a word response for the given stimulus based on the relative frequency of words among the nearest neighbors.
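
The three steps above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: ldknn_sketch is a hypothetical name, adist from utils stands in for levenshtein.distance, and leaving the first stimulus (which has no predecessors) as NA is an assumption.

```r
# Sketch only: ldknn_sketch is a hypothetical name, not part of vwr.
ldknn_sketch <- function(stimuli, types, reference, k = 1) {
  n <- length(stimuli)
  p <- rep(NA_real_, n)  # assumption: first stimulus has no estimate
  for (i in seq_len(n)[-1]) {
    previous <- seq_len(i - 1)
    # Step 1: Levenshtein distances to all previously presented stimuli
    d <- as.vector(utils::adist(stimuli[i], stimuli[previous]))
    # Step 2: stimuli at the k smallest distinct distances (ties included)
    nearest <- d %in% head(sort(unique(d)), k)
    # Step 3: relative frequency of the reference type among the neighbors
    p[i] <- mean(types[previous][nearest] == reference)
  }
  data.frame(stimulus = stimuli, type = types, p = p,
             stringsAsFactors = FALSE)
}

toy <- ldknn_sketch(c("cat", "cot", "dog", "cut"),
                    c("Word", "Nonword", "Word", "Word"),
                    reference = "Word")
toy$p  # NA 1.0 0.0 0.5
```

In the toy run, "cut" is at distance 1 from both "cat" (a word) and "cot" (a nonword), so both count as nearest neighbors and the estimated probability of a word response is 0.5.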

Value

A list with class ldknn.run.

data

A data frame containing the results of the run. stimulus gives the stimulus values, type gives the types of the stimuli, and p gives the probability of a reference.level response for that stimulus.

reference level

The reference level used for the simulation.

Odds

The odds, z value, and p value for a reference level response, resulting from a logistic regression in which the probabilities generated by the ldknn algorithm are used to predict stimulus types.
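
A logistic regression of this kind can be sketched with glm. The data below are synthetic and generated only for illustration, the variable names are hypothetical, and reporting the odds as the exponentiated slope is an assumption about how the value is derived, not a description of the package internals.

```r
set.seed(1)

# Synthetic stand-in for ldknn output: TRUE marks reference-level stimuli,
# p is the probability assigned to a reference-level response.
n <- 200
is_word <- rep(c(TRUE, FALSE), each = n / 2)
p <- ifelse(is_word, rbeta(n, 3, 2), rbeta(n, 2, 3))

# Use the generated probabilities to predict the stimulus type.
fit <- glm(is_word ~ p, family = binomial)
coefs <- summary(fit)$coefficients

odds <- exp(coefs["p", "Estimate"])   # assumption: odds = exp(slope)
z_value <- coefs["p", "z value"]
p_value <- coefs["p", "Pr(>|z|)"]
```

If the probabilities carry no information about stimulus type, the slope should be close to zero; a slope reliably different from zero suggests that stimulus type can be predicted from orthographic similarity to earlier stimuli alone.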

plot and print methods are available for objects of class ld1nn.run

Author(s)

Emmanuel Keuleers

References

Keuleers, E., & Brysbaert, M. (2011). Detecting inherent bias in lexical decision experiments with the LD1NN algorithm. The Mental Lexicon, 6(1), 34–52.

See Also

levenshtein.distance, levenshtein.damerau.distance

Examples

library(vwr)
data(english.words)
data(basque.words)
# set up a mock experiment: English stimuli are words, Basque stimuli are nonwords
experiment <- data.frame(
  stimulus = c(sample(english.words, 500), sample(basque.words, 500)),
  type = factor(rep(c('Word', 'Nonword'), each = 500), levels = c('Word', 'Nonword')),
  stringsAsFactors = FALSE  # keep stimuli as character strings
)
# randomize the trial order
experiment <- experiment[sample(nrow(experiment)), ]
# run the ldknn algorithm with 'Word' as the reference level
results <- ldknn(experiment$stimulus, experiment$type, 'Word')
# summarize and plot the detected bias
print(results)
plot(results)

vwr documentation built on May 2, 2019, 4:23 a.m.