The Ratcliff Diffusion Model
Description
Density, distribution function, quantile function, and random generation for the Ratcliff diffusion model with following parameters: a
(threshold separation), z
(starting point), v
(drift rate), t0
(nondecision time/response time constant), d
(differences in speed of response execution), sv
(intertrialvariability of drift), st0
(intertrialvariability of nondecisional components), sz
(intertrialvariability of relative starting point), and s
(diffusion constant). Note that the parameterization or defaults of nondecision time variability st0
and diffusion constant s
differ from what is often found in the literature and that the parameterization of z
and sz
have changed compared to previous versions (now absolute and not relative).
Usage
1 2 3 4 5 6 7 8 9 10 11 12  ddiffusion(rt, response = "upper", a, v, t0, z = 0.5 * a, d = 0, sz = 0,
sv = 0, st0 = 0, s = 1, precision = 3)
pdiffusion(rt, response = "upper", a, v, t0, z = 0.5 * a, d = 0, sz = 0,
sv = 0, st0 = 0, s = 1, precision = 3, maxt = 10000)
qdiffusion(p, response = "upper", a, v, t0, z = 0.5 * a, d = 0, sz = 0,
sv = 0, st0 = 0, s = 1, precision = 3, maxt = 10000,
interval = c(0, 10), scale_p = FALSE, scale_max = Inf)
rdiffusion(n, a, v, t0, z = 0.5 * a, d = 0, sz = 0, sv = 0, st0 = 0,
s = 1, precision = 3)

Arguments
rt 
a vector of RTs. Or for convenience also a 
response 
character vector. Which response boundary should be tested? Possible values are 
a 
threshold separation. Amount of information that is considered for a decision. Large values indicate a conservative decisional style. Typical range: 0.5 < 
v 
drift rate. Average slope of the information accumulation process. The drift gives information about the speed and direction of the accumulation of information. Large (absolute) values of drift indicate a good performance. If received information supports the response linked to the upper threshold the sign will be positive and vice versa. Typical range: 5 < 
t0 
nondecision time or response time constant (in seconds). Lower bound for the duration of all nondecisional processes (encoding and response execution). Typical range: 0.1 < 
z 
starting point. Indicator of an a priori bias in decision making. When the relative starting point 
d 
differences in speed of response execution (in seconds). Positive values indicate that response execution is faster for responses linked to the upper threshold than for responses linked to the lower threshold. Typical range: 0.1 < 
sz 
intertrialvariability of starting point. Range of a uniform distribution with mean 
sv 
intertrialvariability of drift rate. Standard deviation of a normal distribution with mean 
st0 
intertrialvariability of nondecisional components. Range of a uniform distribution with mean 
s 
diffusion constant; standard deviation of the random noise of the diffusion process (i.e., withintrial variability), scales 
precision 

maxt 
maximum 
p 
vector of probabilities. Or for convenience also a 
interval 
a vector containing the endpoints of the interval to be searched for the desired quantiles (i.e., RTs) in 
scale_p 
logical. Should entered probabilities automatically be scaled by maximally predicted probability? Default is 
scale_max 
numerical scalar. Value at which maximally predicted RT should be calculated if 
n 
is a desired number of observations. 
Details
The Ratcliff diffusion model (Ratcliff, 1978) is a mathematical model for twochoice discrimination tasks. It is based on the assumption that information is accumulated continuously until one of two decision thresholds is hit. For more information, see Voss, Rothermund, and Voss (2004), Voss, Nagler, and Lerche (2013), or Wagenmakers (2009).
All functions are fully vectorized across all parameters as well as the response. This allows for trialwise parameters for each parameter.
For convenience, all functions (with the exception of rdiffusion
) allow that the first argument is a data.frame
containing the information of the first and second argument in two columns (i.e., rt
/p
and response
). Other columns will be ignored. This allows, for example, to pass the data.frame
generated by rdiffusion
directly to pdiffusion
. See examples.
Distribution Function (CDF)
For the time being, the cumulative probability function (pdiffusion
) uses numerical integration of ddiffusion
via integral
from package pracma. While this is somewhat slow, it reliably produces correct results. Please report all issues with the CDF to the package maintainer.
Quantile Function
Due to the bivariate nature of the diffusion model, the diffusion processes reaching each response boundary only return the defective CDF that does not reach 1. Only the sum of the CDF for both boundaries reaches 1. Therefore, qdiffusion
can only return quantiles/RTs for any accumulator up to the maximal probability of that accumulator's CDF. This can be obtained by evaluating the CDF at Inf
.
As a conveniece for the user, if scale_p = TRUE
in the call to qdiffusion
the desired probabilities are automatically scaled by the maximal probability for the corresponding response. Note that this can be slow as the maximal probability is calculated separately for each desired probability. See examples.
Also note that quantiles (i.e., predicted RTs) are obtained by numerically minimizing the absolute difference between desired probability and the value returned from pdiffusion
using optimize
. If the difference between the desired probability and probability corresponding to the returned quantile is above a certain threshold (currently 0.0001) no quantile is returned but NA
. This can be either because the desired quantile is above the maximal probability for this accumulator or because the limits for the numerical integration are too small (default is c(0, 10)
).
Value
ddiffusion
gives the density, pdiffusion
gives the distribution function, qdiffusion
gives the quantile function (i.e., predicted RTs), and rdiffusion
generates random response times and decisions (returning a data.frame
with columns rt
(numeric) and response
(factor)).
The length of the result is determined by n
for rdiffusion
, equal to the length of rt
for ddiffusion
and pdiffusion
, and equal to the length of p
for qdiffusion
.
The distribution parameters (as well as response
) are recycled to the length of the result. In other words, the functions are completely vectorized for all parameters and even the response boundary.
Note
The parameterization of the nondecisional components, t0
and st0
, differs from the parameterization used by, for example, Andreas Voss or Roger Ratcliff in that t0
is not the lower bound of the uniform distribution of length st0
, but its midpoint.
The default diffusion constant s
is 1 and not 0.1 as in most applications of Roger Ratcliff and others.
We have changed the parameterization of the start point z
which is now the absolute start point (it was the relative start point in previous versions of rtdists).
RTs need to be sorted (in increasing order) for pdiffusion
.
Author(s)
Underlying C code by Jochen Voss and Andreas Voss. Porting and R wrapping by Matthew Gretton, Andrew Heathcote, Scott Brown, and Henrik Singmann. qdiffusion
by Henrik Singmann.
References
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85(2), 59108.
Voss, A., Rothermund, K., & Voss, J. (2004). Interpreting the parameters of the diffusion model: An empirical validation. Memory & Cognition. Vol 32(7), 32, 12061220.
Voss, A., Nagler, M., & Lerche, V. (2013). Diffusion Models in Experimental Psychology: A Practical Introduction. Experimental Psychology, 60(6), 385402. doi:10.1027/16183169/a000218
Wagenmakers, E.J., van der Maas, H. L. J., & Grasman, R. P. P. P. (2007). An EZdiffusion model for response time and accuracy. Psychonomic Bulletin & Review, 14(1), 322.
Wagenmakers, E.J. (2009). Methodological and empirical developments for the Ratcliff diffusion model of response times and accuracy. European Journal of Cognitive Psychology, 21(5), 641671.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137  ## identical calls (but different random values)
rt1 < rdiffusion(500, a=1, v=2, t0=0.5)
head(rt1)
rt2 < rdiffusion(500, a=1, v=2, t0=0.5, d=0, sz=0, sv=0, st0=0)
head(rt2)
# get density for random RTs (possible to specify arguments for pdiffusion in same way):
sum(log(ddiffusion(rt1$rt, rt1$response, a=1, v=2, t0=0.5))) # response is factor
sum(log(ddiffusion(rt1$rt, as.numeric(rt1$response), a=1, v=2, t0=0.5))) # response is numeric
sum(log(ddiffusion(rt1$rt, as.character(rt1$response), a=1, v=2, t0=0.5))) # response is character
sum(log(ddiffusion(rt1, a=1, v=2, t0=0.5))) # response is data.frame
sum(log(ddiffusion(rt2$rt, rt2$response, a=1, v=2, t0=0.5)))
# can we recover the parameters?
ll_diffusion < function(pars, rt, response)
{
densities < tryCatch(
ddiffusion(rt, response=response,
a=pars[1], v=pars[2], t0=pars[3],
sz=pars[4],
st0=pars[5], sv=pars[6]),
error = function(e) 0)
if (any(densities == 0)) return(1e6)
return(sum(log(densities)))
}
## Not run:
start < c(runif(2, 0.5, 3), 0.1, runif(3, 0, 0.5))
names(start) < c("a", "v", "t0", "sz", "st0", "sv")
recov < nlminb(start, ll_diffusion, lower = 0, rt=rt1$rt, response=rt1$response)
round(recov$par, 3)
# a v t0 sz st0 sv
#0.954 1.764 0.503 0.000 0.000 0.000
## End(Not run)
## Not run:
## replicate Table 1 from Wagenmakers et al. (2007) using rdiffusion:
n < 1e5 # number of samples
# take parameter valeus from Table 2 and set s to 0.1
george < rdiffusion(n, a = 0.12, v = 0.25, t0 = 0.3, s = 0.1)
rich < rdiffusion(n, a = 0.12, v = 0.25, t0 = 0.25, s = 0.1)
amy < rdiffusion(n, a = 0.08, v = 0.25, t0 = 0.3, s = 0.1)
mark < rdiffusion(n, a = 0.08, v = 0.25, t0 = 0.25, s = 0.1)
george$id < "george"
rich$id < "rich"
amy$id < "amy"
mark$id < "mark"
wag < rbind(george, rich, amy, mark)
wag$id < factor(wag$id, levels = c("george", "rich", "amy", "mark"))
opt < options()
options(digits = 3)
aggregate(cbind(rt, as.numeric(response)1) ~ id, wag, mean)
# id rt V2
# 1 george 0.517 0.952
# 2 rich 0.467 0.953
# 3 amy 0.422 0.881
# 4 mark 0.372 0.882
options(digits = 1)
aggregate(rt ~ id, wag, var)
# id rt
# 1 george 0.024
# 2 rich 0.024
# 3 amy 0.009
# 4 mark 0.009
options(opt)
## End(Not run)
## plot density:
curve(ddiffusion(x, a=1, v=2, t0=0.5, response = "upper"),
xlim=c(0,3), main="Density of upper responses", ylab="density", xlab="response time")
curve(ddiffusion(x, a=1, v=2, t0=0.5, st0=0.2, response = "upper"),
add=TRUE, lty = 2)
legend("topright", legend=c("no", "yes"), title = "Starting Point Variability?", lty = 1:2)
# plot cdf:
curve(pdiffusion(x, a=1, v=2, t0=0.5, st0=0.2, response="u"),
xlim = c(0, 3),ylim = c(0,1),
ylab = "cumulative probability", xlab = "response time",
main = "CDF of diffusion model with start point variability")
curve(pdiffusion(x, a=1, v=2, t0=0.5, st0=0.2, response="l"),
add=TRUE, lty = 2)
legend("topleft", legend=c("upper", "lower"), title="response boundary", lty=1:2)
## Not run:
### qdiffusion can only return values up to maximal predicted probability:
(max_p < pdiffusion(Inf, a=1, v=2, t0=0.5, st0=0.2, sz = 0.1, sv = 0.5, response="u"))
# [1] 0.87
# (Note that with the current integration routine for pdiffusion use Inf and not smaller values.)
qdiffusion(0.87, a=1, v=2, t0=0.5, st0=0.2, sz = 0.1, sv = 0.5, response="u")
# [1] 1.945802
qdiffusion(0.88, a=1, v=2, t0=0.5, st0=0.2, sz = 0.1, sv = 0.5, response="u")
# NA with warning.
# to get predicted quantiles, scale required quantiles by maximally predicted response rate:
qs < c(.1, .3, .5, .7, .9)
qdiffusion(qs*max_p, a=1, v=2, t0=0.5, st0=0.2, sz = 0.1, sv = 0.5, response="u")
# or set scale_p to TRUE which scales automatically by maximum p
# (but can be slow as it calculates max_p for each probability separately)
qdiffusion(qs, a=1, v=2, t0=0.5, st0=0.2, sz = 0.1, sv = 0.5, response="u", scale_p = TRUE)
# qdiffusion also accepts a data.frame as first argument:
t3 < data.frame(p = rep(c(0.05, 0.1, 0.87), 2), response = rep(c("upper", "lower"), each = 3))
# p response
# 1 0.05 upper
# 2 0.10 upper
# 3 0.87 upper
# 4 0.05 lower
# 5 0.10 lower
# 6 0.87 lower
qdiffusion(t3, a=1, v=2, t0=0.5, st0=0.2, sz = 0.1, sv = 0.5)
## End(Not run)
## LBA and diffusion can be used interchangeably:
rt1 < rLBA(500, A=0.5, b=1, t0 = 0.5, mean_v=c(2.4, 1.6), sd_v=c(1,1.2))
rt2 < rdiffusion(500, a=1, v=2, t0=0.5)
# data can also be passed as data.frame (same is true for pLBA):
sum(log(dLBA(rt1, A=0.5, b=1, t0 = 0.5, mean_v=c(2.4, 1.6), sd_v=c(1,1.2))))
sum(log(dLBA(rt2, A=0.5, b=1, t0 = 0.5, mean_v=c(2.4, 1.6), sd_v=c(1,1.2))))
sum(log(ddiffusion(rt1, a=1, v=2, t0=0.5)))
sum(log(ddiffusion(rt2, a=1, v=2, t0=0.5)))
