ddm | R Documentation |
Fit the 5-parameter DDM (Diffusion Decision Model) via maximum likelihood estimation. The model for each DDM parameter can be specified symbolically using R's formula interface. With the exception of the drift rate (which is always estimated) all parameters can be either fixed or estimates.
ddm(
drift,
boundary = ~1,
ndt = ~1,
bias = 0.5,
sv = 0,
data,
optim = "nlminb",
args_optim = list(init = NULL, lo_bds = NULL, up_bds = NULL, control = list()),
args_ddm = list(err_tol = 1e-06),
use_gradient = TRUE,
compiled_model = TRUE,
model = TRUE,
mmatrix = TRUE,
response = TRUE,
na.action,
subset,
contrasts = NULL
)
drift |
Two-sided formula. The left-hand side describes the response,
the right-hand side provides a symbolic description of the regression model
underlying the drift rate (v). The left-hand side needs to specify the
column in the data containing response time and corresponding binary
decision, concatenated by |
boundary , ndt , bias , sv |
Either a one-sided formula providing a symbolic description of the regression model or a scalar number given the value this parameter should be fixed to. Boundary separation (a), non-decision time (t0), relative initial bias (w), or inter-trial variability in the drift rate (sv) |
data , na.action , subset |
arguments controlling formula processing via
|
optim |
character string or fitting function indicating which numerical
optimization method should be used. The default |
args_optim |
named list of additional arguments for the optimization function. The available options are:
|
args_ddm |
named list of additional arguments passed to density function
calculation. Currently, the only option for this is |
use_gradient |
logical. Should gradient be used during numerical
optimization? Default is |
compiled_model , model , mmatrix , response |
logicals. If |
contrasts |
optional list. See the contrasts.arg of
|
ddm
uses model.matrix
for transforming the
symbolic description of the regression model underlying each parameter into
estimated coefficients. The following provides a few examples:
~ 1
estimates a single coefficient, named (Intercept)
~ condition
estimates the intercept plus k - 1 coefficients
for a factor with k levels (e.g., intercept plus one coefficient if
condition has two levels). The interpretation of the coefficients
depend on the factor contrasts employed, which are usually based on
the contrasts specified in options("contrasts")
. For the
default treatment
contrasts (contr.treatment
),
the intercept corresponds to the first factor level and the
additional coefficients correspond to the difference from the
intercept (i.e., first factor level). When using contr.sum
the intercept corresponds to the grand mean and the additional
coefficients correspond to the differences from the grand mean.
~ 0 + condition
estimates no intercept but one coefficient
per factor level. This specification can also be used to get one
coefficient per cell for a multi-factorial design, e.g.,
~ 0 + condition1:condition2
.
~ 0 + condition1 + condition1:condition2
estimates one
"intercept" per level of condition1
factor (which is not
called intercept) plus k - 1 difference parameters from the
condition-specific intercept for the k-levels of condition2
.
The interpretation of the difference parameters again depends on
the contrasts used (e.g., treatment vs. sum-to-zero contrasts, see
examples). This formula specification can often make sense for the
drift rate when condition1
is the factor (such as item type)
mapped to upper and lower response boundary of the DDM and
condition2
is another factor by which we want the drift rate
to differ. This essentially gives one overall drift rate per
response boundary plus the differences from this overall one (note
again that with treatment contrasts this overall drift rate is the
first factor level of condition2
).
To get meaningful results it is necessary to estimate separate drift rates for the different condition/item-types that are mapped onto the upper and lower boundary of the diffusion model.
If a non-default fitting function is used, it needs to minimize the
negative log-likelihood, accept the following arguments,
init, objective, gradient, lower, upper, control
, and return a list
with the following arguments coefficients, loglik, converged, optim
(where converged
is boolean and optim
can be an arbitrary
list with additional information).
Object of class ddm
(i.e., a list with components as listed
below) for which a number of common methods such as print
,
coef
, and logLik
are implemented, see
ddm-methods
.
coefficients
a named list whose elements are the values of
the estimated model parameters
dpar
a character vector containing the names of the
estimated model parameters
fixed_dpar
a named list whose elements are the values of the
fixed model parameters
loglik
the value of the log-likelihood function at the
optimized parameter values
hessians
a named list whose elements are the individual
Hessians for each of the model parameters
vcov
a named list whose elements are the individual
variance-covariance matrices for each of the model parameters
nobs
the number of observations in the data used for fitting
npar
the number of parameters used to fit the model (i.e.,
the estimated model parameters plus any hyperparameters)
df.residual
the residual degrees of freedom (the number of
observations - the number of parameters)
call
the original function call to ddm
formula
the formulas used in the model (1
indicates
that the model parameter was estimated with a single coefficient;
0
indicates that the model parameter was fixed)
dpar_formulas
a named list whose elements are the formulas
for the model parameters (1
indicates that the model
parameter was estimated with a single coefficient; 0
indicates that the model parameter was fixed)
na.action
na.action
terms
a named list whose elements are the model parameters,
except the last element is named full
and shows the
breakdown of the model with all model parameters
levels
a named list whose elements are the levels associated
with any parameters that are factors (the elements are NULL
if the parameter is not a factor), and whose last element is named
FULL
and shows all of the levels used in the model
contrasts
a named list whose elements are the type of
contrasts used in the model
args_ddm
a named list whose elements are the optional
arguments used in the calculation of the DDM log-likelihood
function
link
a named list whose elements show information about the
link function used for each model parameter (currently the only
link function is the identity function)
converged
a logical indicating whether the optimization
converged (TRUE
) or not (FALSE
)
optim_info
a named list whose elements are information about
the optimization process (e.g., the name of the algorithm used,
the final value of the objective function, the number of
evaluations of the gradient function, etc.)
model
the data used in the model (might need to check this)
response
the response data used in the model
mmatrix
a named list whose elements are the model matrices
for each of the estimated parameters
compiled_model
C++ object that contains the compiled model
(see list below for more details)
The C++ object accessible via the compiled_model
component of the
above R object of class ddm
contains the following components:
rt
a numeric vector of the response time data used in the
model
response
an integer vector of the response data used in the
model (coded such that 1
corresponds to the "lower" boundary
and 2
corresponds to the "upper" boundary)
err_tol
the error tolerance used in the calculations for
fitting the DDM
coefficients
a numeric vector containing the current set of
coefficients for the formulas provided to the ddm()
function
call; the coefficients correspond to the DDM parameters in the
following order: v
, a
, t0
, w
, sv
likelihood
a double containing the log-likelihood for the
current set of coefficients
(note this can be changed by
calling the function calculate_loglik()
below)
modmat_v
a numeric matrix containing the model matrix for
v
, the drift rate, determined by the formula input to the
argument drift
in the ddm()
function call
modmat_a
a numeric matrix containing the model matrix for
a
, the boundary separation, determined by the formula input
to the argument boundary
in the ddm()
function call
modmat_t0
a numeric matrix containing the model matrix for
t0
, the non-decision time, determined by the formula input
to the argument ndt
in the ddm()
function call
modmat_w
a numeric matrix containing the model matrix for
w
, the inital bias, determined by the formula input to the
argument bias
in the ddm()
function call
modmat_sv
a numeric matrix containing the model matrix for
sv
, the inter-trial variability in the drift rate,
determined by the formula input to the argument sv
in the
ddm()
function call
hess_v
a numeric matrix containing the Hessian for v
,
the drift rate, whose dimensions are determined by the formula
input to the argument drift
in the ddm()
function
call
hess_a
a numeric matrix containing the Hessian for a
,
the boundary separation, whose dimensions are determined by the
formula input to the argument drift
in the ddm()
function call
hess_t0
a numeric matrix containing the Hessian for
t0
, the non-decision time, whose dimensions are determined
by the formula input to the argument drift
in the
ddm()
function call
hess_w
a numeric matrix containing the Hessian for w
,
the initial bias, whose dimensions are determined by the formula
input to the argument drift
in the ddm()
function
call
hess_sv
a numeric matrix containing the Hessian for
sv
, the inter-trial variability in the drift rate, whose
dimensions are determined by the formula input to the argument
drift
in the ddm()
function call
vcov_v
a numeric matrix containing the variance-covariance
matrix for v
, the drift rate, whose dimensions are
determined by the formula input to the argument drift
in the
ddm()
function call
vcov_a
a numeric matrix containing the variance-covariance
matrix for a
, the boundary separation, whose dimensions are
determined by the formula input to the argument boundary
in
the ddm()
function call
vcov_t0
a numeric matrix containing the variance-covariance
matrix for t0
, the non-decision time, whose dimensions are
determined by the formula input to the argument ndt
in the
ddm()
function call
vcov_w
a numeric matrix containing the variance-covariance
matrix for w
, the inital bias, whose dimensions are
determined by the formula input to the argument bias
in the
ddm()
function call
vcov_sv
a numeric matrix containing the variance-covariance
matrix for v
, the inter-trial variability in the drift rate,
whose dimensions are determined by the formula input to the
argument sv
in the ddm()
function call
calculate_loglik
calculates and returns a double containing
the negated log-likelihood (note that this will overwrite the
likelihood
component of the C++ object)
calculate_gradient
calculates and returns a numeric vector
of the negated gradients for the provided coefficient values; the
gradients are stored in the same manner as their corresponding
coefficents
(note that this will overwrite the
likelihood
) component of the C++ object)
calculate_hessians
calculates and returns a named list of
the negated Hessians for each model parameter for the provided
coefficient values (note that this will overwrite the
likelihood
, hess_v
, hess_a
, hess_t0
,
hess_w
, and hess_sv
components of the C++ object)
calculate_vcov
calculates and returns a named list of the
variance-covariance matrices for each model parameter for the
stored coefficients
(note that this will overwrite the
likelihood
, hess_v
, hess_a
, hess_t0
,
hess_w
, hess_sv
, vcov_v
, vcov_a
,
vcov_t0
, vcov_w
, and vcov_sv
components of the
C++ object)
calculate_standard_error
calculates and returns a numeric
vector of the standard errors of the stored coefficients
;
the standard errors are stored in the same manner as their
corresponding coefficients
(note that this will overwrite
the likelihood
, hess_v
, hess_a
,
hess_t0
, hess_w
, hess_sv
, vcov_v
,
vcov_a
, vcov_t0
, vcov_w
, and vcov_sv
components of the C++ object)
# prepare data
data(med_dec, package = "fddm")
med_dec <- med_dec[which(med_dec[["rt"]] >= 0), ] ## only use valid RTs
## select data from one participant
p1 <- med_dec[med_dec[["id"]] == 2 & med_dec[["group"]] == "experienced", ]
head(p1)
##----------------------------------------------
## Easiest: Fitting using emmeans -
##----------------------------------------------
## Because we use an ANOVA approach, we set orthogonal sum-to-zero contrasts
op <- options(contrasts = c('contr.sum', 'contr.poly'))
fit0 <- ddm(rt + response ~ classification*difficulty, data = p1)
summary(fit0)
## for more tests, we use emmeans:
if (requireNamespace("emmeans")) {
# for ANOVA table:
emmeans::joint_tests(fit0)
# get conditional main effects of difficulty (for each level of classification):
emmeans::joint_tests(fit0, by = "classification")
# get mean drift rates per condition:
em1 <- emmeans::emmeans(fit0, "difficulty", by = "classification")
em1
# compare mean drift rates per condition
pairs(em1)
update(pairs(em1), by = NULL, adjust = "holm")
}
options(op) # reset contrasts
##----------------------------------------------------------------
## Fitting with custom parametrisation -
##----------------------------------------------------------------
## one drift rate per classification by difficulty design cell
fit1 <- ddm(rt + response ~ 0 + classification:difficulty, data = p1)
summary(fit1)
## set default contrasts (just in case contrasts have been changed)
op <- options(contrasts = c('contr.treatment', 'contr.poly'))
## one drift rate "intercept" per classification condition (blast vs. non-blast)
## corresponding to first level of difficulty factor (easy)
## plus one further coefficient per classification condition corresponding to
## difference from "intercept" (hard - easy)
fit1b <- ddm(rt + response ~ 0 + classification + classification:difficulty,
data = p1, args_optim = list(control = list(iter.max = 1000,
eval.max = 1000)))
summary(fit1b)
options(op) # reset contrasts
## set orthogonal sum-to-zero contrasts
op <- options(contrasts = c('contr.sum', 'contr.poly'))
## one drift rate "intercept" per classification condition (blast vs. non-blast)
## corresponding to mean drift rate for the classification condition
## plus one further coefficient per classification condition corresponding to
## difference from "intercept" (hard/easy - mean drift rate)
fit1c <- ddm(rt + response ~ 0 + classification + classification:difficulty,
data = p1)
summary(fit1c)
options(op) ## reset contrasts
## all variants produce same fit, only meaning of parameters differs
logLik(fit1)
logLik(fit1b)
logLik(fit1c)
logLik(fit0) ## also model above
## all models estimate same drift rates, but in different parametrisation:
coef(fit1) ## drift rates per design cell
## same drift rates based on fit1b:
c(coef(fit1b)[1:2],
coef(fit1b)[1] + coef(fit1b)[3], coef(fit1b)[2] + coef(fit1b)[4])
## same drift rates based on fit1c:
c(coef(fit1c)[1] + coef(fit1c)[3], coef(fit1c)[2] + coef(fit1c)[4],
coef(fit1c)[1] - coef(fit1c)[3], coef(fit1c)[2] - coef(fit1c)[4])
# we can estimate a model that freely estimates response bias
# (instead of fixing it at 0.5)
fit2 <- ddm(rt + response ~ 0 + classification:difficulty, bias = ~1, data = p1)
summary(fit2)
## Note: estimating bias only makes sense in situations like here where the
## response boundaries do not correspond to correct/incorrect but to the
## actual responses participants gave (here: blast vs. non-blast classification)
## now let's perform a likelihood ratio test to check if estimating response
## bias freely leads to a significant increase in model fit?
if (requireNamespace("lmtest")) { ## requires package lmtest
lmtest::lrtest(fit1, fit2)
## does not look like it (p = 0.1691)
}
# we can also make a DDM parameter, such as boundary, depend on a numeric
# variable, such as block number
fit3 <- ddm(rt + response ~ 0 + classification:difficulty,
boundary = ~ block, data = p1)
summary(fit3)
## does making boundary depend on block leads to a significant increase in model
## fit?
if (requireNamespace("lmtest")) { ## requires package lmtest
lmtest::lrtest(fit1, fit3)
## does not look like it (p = 0.198)
}
##----------------------------------------------------------------
## Fitting with optimization arguments -
##----------------------------------------------------------------
## example of how to use your own initial values and bounds for the optimization
## of the coefficients (determined by the model matrices/formulas)
options(op) # reset contrasts
# this uses the default generated initial values and bounds
fitex0 <- ddm(rt + response ~ 0 + classification:difficulty, data = p1)
# we can see the number of coefficients (and thus the number of initial values
# and bounds) required if we wish to use our own
fitex0$optim_info$args_optim$init
fitex0$optim_info$args_optim$lo_bds
fitex0$optim_info$args_optim$up_bds
# -1.9270279 2.5408864 -1.9270279 0.3181987 1.7907438 0.1264035
# -Inf -Inf -Inf -Inf 0 0
# Inf Inf Inf Inf Inf 0.461
# the first four coefficients are for the drift rate (given by the formula we
# provided), and the last two are for the boundary separation and non-decision
# time, respectively (note that the default is to estimate these two model
# parameters with a single coefficient).
# to use our own initial values, we can include them in the `args_optim` list,
# but note that they must be within the generated bounds
fitex1 <- ddm(rt + response ~ 0 + classification:difficulty, data = p1,
args_optim = list(init = c(-1.5, 1, -1, 1, 1.5, 0.3)))
# to use our own bounds, we again can include them in the `args_optim` list,
# but note that they must contain the generated initial values
fitex2 <- ddm(rt + response ~ 0 + classification:difficulty, data = p1,
args_optim = list(lo_bds = c(-20, -20, -20, -20, 0, 0),
up_bds = c(20, 20, 20, 20, 10, 0.5)))
# to use both our own initial values and bounds, we include them in the
# `args_optim` list, and the bounds must contain the initial values
fitex3 <- ddm(rt + response ~ 0 + classification:difficulty, data = p1,
args_optim = list(init = c(-1.5, 1, -1, 1, 1.5, 0.3),
lo_bds = c(-20, -20, -20, -20, 0, 0),
up_bds = c(20, 20, 20, 20, 10, 0.5)))
# the only other option in the `args_optim` list is the control parameters
# directly used by the optimization function (e.g., the maximum number of
# iterations); here we'll set the maximum number of iterations and function
# evaluations
fitex4 <- ddm(rt + response ~ 0 + classification:difficulty, data = p1,
args_optim = list(control = list(iter.max = 1000,
eval.max = 1000)))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.