R/cpdetectoR.R

#' cpdetectoR: Change point estimation in learning curves
#'
#' The code is from Gallistel et al. (2004), translated from MATLAB to R, with large
#' portions of the original comments and code preserved. The package consists of a few
#' internal functions and one wrapper function, \code{cp_wrapper}.
#'
#' @docType package
#' @name cpdetectoR

NULL

#' Data on rabbit eyeblink conditioning
#'
#' Data from Gallistel et al. (2004). Original data description: "[Data] are binary,
#' 1 or 0, according as a conditioned blink did or did not occur on a trial. They are
#' discrete-trial data, so the cpd function must be used to find putative change points.
#' They are frequency data, so one might try using the chisquare test. However, these are
#' the kind of data that cause the chisquare test to run into computational difficulties.
#' The random rate test works better with these data."
#'
#' @source \url{http://www.pnas.org/content/suppl/2004/08/31/0404965101.DC1/04965DataSet5.txt/}
"eyeblink"

#' Data on correct choices in a plus maze
#'
#' Data from Gallistel et al. (2004). Original data description: "These are
#' frequencies, so, in principle, a chisquare test is appropriate to test for changes in
#' the expected frequency of correct choice. However, for some frequency data sets, the
#' chisquare test runs into computational difficulties. The chisquare computation is not
#' valid unless the expected number of observations in the cell with the smallest
#' expectations is 5 or greater. The function tests for this, and, if this condition is
#' not met, then it uses the Fisher's exact test. However, Fisher's exact test uses
#' factorials, and these can become intractably large. This happens in data where the
#' frequency before a change is already high, say 0.85, and it becomes even higher after
#' the change. Under these conditions, the numbers of observations in the more populous
#' cells become large and the factorials intractable. One can also treat these data as
#' generated by a random rate process that has a certain probability of generating a
#' correct choice on any given trial. In that case, one would use the random rate test.
#' This test is more computationally robust (less likely to run afoul of computational
#' problems) than the chi square test and should be used whenever the chi square test
#' fails for computational reasons."
#'
#' @source \url{http://www.pnas.org/content/suppl/2004/08/31/0404965101.DC1/04965DataSet3.txt/}
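#' @examples
#' # A minimal sketch of the expected-count rule quoted above, in base R only.
#' # The counts are hypothetical (not taken from plusmaze), and this is an
#' # illustration, not the package's internal implementation.
#' counts <- matrix(c(17, 3, 28, 2), nrow = 2,
#'                  dimnames = list(choice = c("correct", "error"),
#'                                  period = c("before", "after")))
#' # Expected cell counts under independence: row total * column total / N
#' expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
#' if (min(expected) >= 5) {
#'   result <- chisq.test(counts, correct = FALSE)
#' } else {
#'   # Smallest expected count is below 5, so fall back to Fisher's exact test
#'   result <- fisher.test(counts)
#' }
#' result$p.value
#' # Why raw factorials become intractable in double precision:
#' is.infinite(factorial(171))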
"plusmaze"

#' Data on interreward intervals in a matching experiment with concurrent variable
#' interval schedules
#'
#' Data from Gallistel et al. (2004). Original data description: "These are an example of
#' a continuous-time data record: the successive entries are the durations of the
#' successive interreward intervals. Thus, the putative change points must be found by
#' [setting isDiscrete = FALSE]. The interevent intervals are approximately exponentially
#' distributed [...]. Thus, the process approximates a random rate process and the random
#' rate logit method is appropriate [...]."
#'
#' @source \url{http://www.pnas.org/content/suppl/2004/08/31/0404965101.DC1/04965DataSet1.txt/}
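#' @examples
#' # A minimal sketch with simulated (hypothetical) intervals, not the matching
#' # data: for exponentially distributed interreward intervals, the rate on
#' # either side of an assumed change point can be estimated as 1 / mean interval.
#' set.seed(1)
#' intervals <- c(rexp(100, rate = 1 / 10), rexp(100, rate = 1 / 5))
#' cp <- 100  # assumed change point, for illustration only
#' rate_before <- 1 / mean(intervals[1:cp])
#' rate_after <- 1 / mean(intervals[(cp + 1):length(intervals)])
#' c(before = rate_before, after = rate_after)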
"matching"

#' Data on successive hopper-entry speeds
#'
#' Data from Gallistel et al. (2004). Original data description: "A hopper entry speed is
#' the reciprocal of the latency between the rise of the grain bin into the feeding hopper
#' and the entry of the pigeon's head into the hopper. These data are an example of a
#' discrete-time data record. [...] The entry speeds are approximately normally
#' distributed [... t]hus, the t-test is appropriate for testing for a change in the
#' mean entry latency."
#'
#' @source \url{http://www.pnas.org/content/suppl/2004/08/31/0404965101.DC1/04965DataSet2.txt/}
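#' @examples
#' # A minimal sketch with simulated (hypothetical) latencies, not the
#' # hopperentry data: convert latencies to entry speeds, then compare the mean
#' # speed before and after an assumed change point with a t-test.
#' set.seed(1)
#' latency <- c(runif(30, 2, 4), runif(30, 0.5, 1))  # seconds
#' speed <- 1 / latency                               # reciprocal of latency
#' cp <- 30                                           # assumed change point
#' res <- t.test(speed[1:cp], speed[(cp + 1):length(speed)])
#' res$p.value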
"hopperentry"

#' Data on feeding-by-feeding preference scores in a mouse matching experiment
#'
#' Data from Gallistel et al. (2004). Original data description: "The side-preference
#' score during any one interfeeding interval is the difference between the amounts of
#' time spent at each feeding hopper divided by their sum. These are discrete-trial,
#' real-valued measures [...] They do not obey any standard distribution,
#' so the Kolmogorov-Smirnov test is appropriate for comparing the distributions before
#' and after a putative change point."
#'
#' @source \url{http://www.pnas.org/content/suppl/2004/08/31/0404965101.DC1/04965DataSet4.txt/}
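#' @examples
#' # A minimal sketch with simulated (hypothetical) visit durations, not the
#' # feedingpref data: compute the side-preference score described above, then
#' # compare its distribution before and after an assumed change point with the
#' # Kolmogorov-Smirnov test.
#' set.seed(1)
#' t_left <- c(rexp(40, rate = 1 / 20), rexp(40, rate = 1 / 60))
#' t_right <- c(rexp(40, rate = 1 / 40), rexp(40, rate = 1 / 20))
#' # Difference between the times at each hopper divided by their sum
#' pref <- (t_left - t_right) / (t_left + t_right)
#' cp <- 40  # assumed change point
#' res <- ks.test(pref[1:cp], pref[(cp + 1):length(pref)])
#' res$p.value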
"feedingpref"

#' Data on swim efficiencies in a rat learning a water maze
#'
#' Data from Gallistel et al. (2004). Original data description: "The swim efficiency is
#' the straight-line distance between where the rat is placed in the tank and the location
#' of the platform divided by the distance actually swum in reaching the platform. These
#' are again discrete-trial data. The measures can fall anywhere in the interval from 0 to
#' 1. They do not appear to be normally distributed, so one might want to use the K-S
#' statistic. However, there are only 32 trials (data) and the K-S test requires a minimum
#' of 4 data in each sample (before and after a putative change point), so one might also
#' want to try the t test on these data."
#'
#' @source \url{http://www.pnas.org/content/suppl/2004/08/31/0404965101.DC1/04965DataSet6.txt/}
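#' @examples
#' # A minimal sketch with simulated (hypothetical) distances, not the watermaze
#' # data: compute swim efficiency as described above, then use the K-S test
#' # only if both samples have at least 4 observations, else fall back to a
#' # t-test.
#' set.seed(1)
#' straight <- runif(32, 0.5, 1.5)  # straight-line distance to the platform
#' swum <- straight * c(runif(20, 3, 10), runif(12, 1, 2))  # distance swum
#' efficiency <- straight / swum    # falls in the interval from 0 to 1
#' cp <- 20                         # assumed change point
#' before <- efficiency[1:cp]
#' after <- efficiency[(cp + 1):length(efficiency)]
#' if (min(length(before), length(after)) >= 4) {
#'   res <- ks.test(before, after)
#' } else {
#'   res <- t.test(before, after)
#' }
#' res$p.value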
"watermaze"
ontogenerator/cpdetectoR documentation built on May 14, 2019, 1:59 a.m.