| flag_contaminant_rts | R Documentation |
Identifies contaminant reaction times (RTs), such as fast guesses and attention lapses, at the trial level using mixture modeling. For each trial, it computes the posterior probability that the RT is a contaminant, given a mixture of a uniform distribution (contaminants) and an RT distribution.
The function takes a numeric vector of RTs and returns a numeric vector of
contamination probabilities, making it compatible with dplyr::mutate() and
dplyr::group_by() workflows.
flag_contaminant_rts(
  rt,
  distribution = c("exgaussian", "lognormal", "invgaussian"),
  contaminant_bound = c("min", "max"),
  init_contaminant = 0.05,
  max_contaminant = 0.5,
  maxit = 100,
  tol = 1e-06
)
rt |
Numeric vector. Reaction times in seconds. Must be positive. |
distribution |
Character. RT distribution for the mixture model: "exgaussian" (default), "lognormal", or "invgaussian". |
contaminant_bound |
Vector of length 2. Bounds (a, b) of the uniform contaminant distribution. Default c("min", "max"). |
init_contaminant |
Numeric. Initial contaminant proportion for EM algorithm. Must be in (0, 1). Default 0.05. |
max_contaminant |
Numeric. Maximum allowed contaminant proportion. Values exceeding this are clipped with a warning. Must be in (0, 1]. Default 0.5. |
maxit |
Integer. Maximum EM iterations. Default 100. |
tol |
Numeric. Convergence tolerance for log-likelihood. Default 1e-6. |
The function fits: f(RT) = pi_c * Uniform(a,b) + (1-pi_c) * f_RT(RT|theta)
where pi_c is the contaminant proportion, Uniform(a,b) is the contaminant
distribution over contaminant_bound, and f_RT is the specified RT
distribution with parameters theta.
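Given fitted values of pi_c, a, b, and theta, the posterior contamination probability for a trial follows from Bayes' rule: the weighted uniform density divided by the total mixture density. The following is a minimal sketch in base R, assuming a lognormal RT component with known parameters. The helper posterior_contaminant() and all parameter values are hypothetical and chosen for illustration; this is not the package's internal code.

```r
# Hypothetical helper (not part of the package): posterior P(contaminant | RT)
# for a uniform + lognormal mixture whose parameters are assumed known.
posterior_contaminant <- function(rt, pi_c, a, b, meanlog, sdlog) {
  # Weighted density of the contaminant component: pi_c * Uniform(a, b)
  dens_contam <- pi_c * dunif(rt, min = a, max = b)
  # Weighted density of the RT component: (1 - pi_c) * f_RT(RT | theta)
  dens_rt <- (1 - pi_c) * dlnorm(rt, meanlog = meanlog, sdlog = sdlog)
  # Bayes' rule: share of the mixture density owed to the contaminant part
  dens_contam / (dens_contam + dens_rt)
}

# Illustrative values only: RTs in seconds, 5% contaminants on [0.1, 3]
rt <- c(0.12, 0.45, 0.60, 2.50)
p <- posterior_contaminant(rt, pi_c = 0.05, a = 0.1, b = 3,
                           meanlog = log(0.5), sdlog = 0.3)
round(p, 3)
```

Under these assumed parameters, RTs far out in either tail of the lognormal component receive high contamination probabilities, while RTs near its mode receive low ones.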
To fit separate mixtures by condition or response boundary, use
dplyr::group_by() before calling this function inside dplyr::mutate().
Mixture fit diagnostics (parameters, convergence, log-likelihood) are
attached as the "diagnostics" attribute of the returned vector.
Access them with attr(result, "diagnostics").
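To show how init_contaminant, max_contaminant, maxit, and tol might interact in an EM fit, here is a minimal sketch for a uniform + lognormal mixture in base R. Everything in it is an assumption made for illustration: the function em_uniform_lognormal() is hypothetical, the uniform bounds are taken to be the observed minimum and maximum RT, convergence is judged by the absolute change in log-likelihood, and the contaminant proportion is clipped silently. The package's actual algorithm may differ in all of these respects.

```r
# Hypothetical EM sketch (not the package's implementation):
# mixture of Uniform(a, b) contaminants and a lognormal RT component.
em_uniform_lognormal <- function(rt, init_contaminant = 0.05,
                                 max_contaminant = 0.5,
                                 maxit = 100, tol = 1e-6) {
  stopifnot(is.numeric(rt), all(rt > 0))
  a <- min(rt)                      # assumed lower bound: observed minimum
  b <- max(rt)                      # assumed upper bound: observed maximum
  pi_c <- init_contaminant          # starting contaminant proportion
  meanlog <- mean(log(rt))          # starting lognormal parameters
  sdlog <- sd(log(rt))
  loglik_old <- -Inf
  converged <- FALSE
  iter <- 0L

  for (iter in seq_len(maxit)) {
    # E-step: posterior probability that each trial is a contaminant
    d_c <- pi_c * dunif(rt, a, b)
    d_r <- (1 - pi_c) * dlnorm(rt, meanlog, sdlog)
    post <- d_c / (d_c + d_r)

    # Stop when the log-likelihood changes by less than tol
    loglik <- sum(log(d_c + d_r))
    if (abs(loglik - loglik_old) < tol) {
      converged <- TRUE
      break
    }
    loglik_old <- loglik

    # M-step: update the proportion (clipped at max_contaminant) and the
    # lognormal parameters, weighting each trial by 1 - post
    pi_c <- min(mean(post), max_contaminant)
    w <- 1 - post
    meanlog <- sum(w * log(rt)) / sum(w)
    sdlog <- sqrt(sum(w * (log(rt) - meanlog)^2) / sum(w))
  }

  # Posterior probabilities and log-likelihood under the final parameters
  d_c <- pi_c * dunif(rt, a, b)
  d_r <- (1 - pi_c) * dlnorm(rt, meanlog, sdlog)
  list(post = d_c / (d_c + d_r), contaminant_prop = pi_c,
       meanlog = meanlog, sdlog = sdlog, converged = converged,
       iterations = iter, loglik = sum(log(d_c + d_r)))
}

# Simulated data: lognormal RTs plus uniformly distributed contaminants
set.seed(1)
rt_sim <- c(rlnorm(150, meanlog = log(0.5), sdlog = 0.3),
            runif(30, min = 0.05, max = 3))
fit <- em_uniform_lognormal(rt_sim)
str(fit[c("contaminant_prop", "converged", "iterations", "loglik")])
```

The lognormal component is used here because its weighted maximum-likelihood updates have a closed form; the ex-Gaussian and inverse Gaussian options would need different M-step updates.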
Numeric vector of posterior contamination probabilities P(contaminant
| RT), with a "diagnostics" attribute containing a one-row data.frame
with columns: mixture_params (list), contaminant_prop, converged,
iterations, loglik, n_trials, distribution, method.
ezdm_summary_stats() for aggregated RT statistics with contamination
handling; validate_fast_guesses() for testing whether flagged contaminants
show random guessing behavior.
## Not run:
# Simulate data with contaminants
library(bmm)
set.seed(123)
rt_clean <- rgamma(150, shape = 5, rate = 10)
rt_contam <- runif(50, 0.1, 0.2)
data <- data.frame(
  rt = c(rt_clean, rt_contam),
  subject = 1,
  response = sample(c("upper", "lower"), 200, replace = TRUE)
)
# Basic usage with mutate
library(dplyr)
data <- data |>
  mutate(contam_prob = flag_contaminant_rts(rt))
# Hard threshold: remove trials with P(contaminant) > 0.5
data_clean <- data |> filter(contam_prob <= 0.5)
# Separate fits by response boundary
data <- data |>
  group_by(subject, response) |>
  mutate(contam_prob = flag_contaminant_rts(rt))
# Access diagnostics
probs <- flag_contaminant_rts(data$rt)
attr(probs, "diagnostics")
## End(Not run)