rapidsplit — R Documentation

Description:

A very fast algorithm for computing stratified permutation-based split-half reliability.
Usage:

rapidsplit(
  data,
  subjvar,
  diffvars = NULL,
  stratvars = NULL,
  subscorevar = NULL,
  aggvar,
  splits = 6000,
  aggfunc = c("means", "medians"),
  errorhandling = list(type = c("none", "fixedpenalty"), errorvar = NULL,
                       fixedpenalty = 600, blockvar = NULL),
  standardize = FALSE,
  include.scores = TRUE,
  verbose = TRUE,
  check = TRUE
)

## S3 method for class 'rapidsplit'
print(x, ...)

## S3 method for class 'rapidsplit'
plot(
  x,
  type = c("average", "minimum", "maximum", "random", "all"),
  show.labels = TRUE,
  ...
)

rapidsplit.chunks(
  data,
  subjvar,
  diffvars = NULL,
  stratvars = NULL,
  subscorevar = NULL,
  aggvar,
  splits = 6000,
  aggfunc = c("means", "medians"),
  errorhandling = list(type = c("none", "fixedpenalty"), errorvar = NULL,
                       fixedpenalty = 600, blockvar = NULL),
  standardize = FALSE,
  include.scores = TRUE,
  verbose = TRUE,
  check = TRUE,
  chunks = 4,
  cluster = NULL
)
Arguments:

data: Dataset, a data.frame.

subjvar: Subject ID variable name, a character string.

diffvars: Names of variables that determine which conditions need to be subtracted from each other; a character vector or NULL.

stratvars: Additional variables that the splits should be stratified by; a character vector or NULL.

subscorevar: A character string naming a variable that divides each participant's data into subsets (e.g. IAT blocks); a separate subscore is computed within each subset, and the subscores are then averaged into a single score per person.

aggvar: Name of the variable whose values are to be aggregated, a character string (e.g. reaction times or errors).

splits: Number of split-halves to average, an integer; more splits yield a more stable estimate at greater computational cost.

aggfunc: The function by which to aggregate the variable defined in aggvar; either "means" or "medians". A custom aggregation function can also be supplied (see the examples).

errorhandling: A list with 4 named items, used to replace error trials with the block mean of correct responses plus a fixed penalty, as in the IAT D-score (a conceptual sketch follows this list). The 4 items are type ("none" for no replacement, or "fixedpenalty" to apply it), errorvar (the name of the 0/1 or logical variable flagging error trials), fixedpenalty (the amount added to the block mean of correct responses), and blockvar (the name of the variable identifying blocks).

standardize: Whether to divide scores by the subject's SD; a logical.

include.scores: Include all individual split-half scores in the output? A logical; defaults to TRUE.

verbose: Display progress bars? Defaults to TRUE.

check: Check the input for possible problems? Defaults to TRUE.

x: A rapidsplit object to print or plot.

...: Ignored.

type: Character argument indicating what should be plotted. By default, this plots the random split whose correlation is closest to the average. However, this can also plot the split with the "minimum" or "maximum" correlation, a "random" split, or "all" splits together.

show.labels: Should participant IDs be shown above their points in the scatterplot? Defaults to TRUE.

chunks: Number of chunks to divide the splits into, for more memory-efficient computation, and to divide over multiple cores if requested.

cluster: Chunks will be run on separate cores if a cluster object is provided, or an integer giving the number of cores to use; with the default NULL, chunks are run sequentially.
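Conceptually, the fixedpenalty option replaces each error trial's value with the mean of that block's correct trials plus the penalty. Below is a minimal base-R sketch of that rule; the helper name replace_errors() is hypothetical, and this is not the package's internal code.

# Hypothetical helper illustrating the "fixedpenalty" rule:
# each error trial gets (block mean of correct trials) + penalty.
replace_errors <- function(latency, error, block, penalty = 600) {
  for (b in unique(block)) {
    idx <- block == b
    latency[idx & error == 1] <- mean(latency[idx & error == 0]) + penalty
  }
  latency
}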
Details:

The order of operations (with optional steps in brackets) is:

1. Splitting
2. (Replacing error trials within block within split)
3. Computing aggregates per condition (per subscore) per person
4. Subtracting conditions from each other
5. (Dividing the resulting (sub)score by the SD of the data used to compute that (sub)score)
6. (Averaging subscores together into a single score per person)
7. Computing the covariances of scores from one half with scores from the other half for every split
8. Computing the variances of scores within each half for every split
9. Computing the average split-half correlation from the average variances and covariances across all splits, using corStatsByColumns()
10. Applying the Spearman-Brown formula to the absolute correlation using spearmanBrown(), and restoring the original sign afterwards (see the sketch below)

cormean() was used to aggregate correlations in previous versions of this package and in the associated manuscript, but the method based on (co)variance averaging was found to be more accurate. This was suggested by prof. John Christie of Dalhousie University.
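For intuition, here is a minimal, unstratified base-R sketch of a single iteration of this procedure; the helper one_split() is an illustrative assumption, not the package's implementation, which stratifies the splits and averages (co)variances rather than correlations. The Spearman-Brown step computes 2|r| / (1 + |r|) and then restores the sign of r.

# Sketch only: one random split-half iteration. Split each person's
# trials in half, aggregate per half, correlate across persons, apply
# Spearman-Brown. Assumes every participant has at least two trials.
one_split <- function(data, subjvar, aggvar) {
  halves <- sapply(split(data[[aggvar]], data[[subjvar]]), function(x) {
    h <- sample(rep(1:2, length.out = length(x)))  # random half assignment
    c(mean(x[h == 1]), mean(x[h == 2]))            # aggregate per half
  })
  r <- cor(halves[1, ], halves[2, ])               # split-half correlation
  sign(r) * 2 * abs(r) / (1 + abs(r))              # Spearman-Brown on |r|
}
# e.g. one_split(foodAAT, "subjectid", "RT")  # assumes data(foodAAT)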
Value:

A list containing:

r: the averaged reliability.

ci: the 95% confidence interval of the reliability.

allcors: a vector with the reliability of each iteration.

nobs: the number of participants.

scores: the individual participants' scores in each split-half, contained in a list with two matrices (only included if requested with include.scores).
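For instance, with the frel object created in the examples below, the components can be inspected like this:

frel$r               # averaged split-half reliability
frel$ci              # 95% confidence interval
hist(frel$allcors)   # spread of the per-split reliabilities
str(frel$scores)     # list of two matrices of per-split scores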
Note:

This function can use a lot of memory in one go. If you are computing the reliability of a large dataset or you have little RAM, it may pay off to use the sequential version of this function instead: rapidsplit.chunks().

It is currently unclear whether it is better to pre-process your data before or after splitting it. If you are computing the IAT D-score, you can therefore use errorhandling and standardize to perform these two actions after splitting, or you can process your data before splitting and forgo these two options (sketched below).
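A sketch of the pre-splitting route, assuming the raceIAT variables used in the examples below; the 600 ms penalty mirrors the errorhandling default, and subject-level standardization, which would likewise be applied beforehand, is omitted here.

# Sketch only: apply the D-score error penalty before splitting, then
# call rapidsplit() without errorhandling (and without standardize).
data(raceIAT)
prepped <- raceIAT
groups <- interaction(prepped$session_id, prepped$block_number, drop = TRUE)
for (g in split(seq_len(nrow(prepped)), groups)) {
  err <- prepped$error[g] == 1
  prepped$latency[g][err] <- mean(prepped$latency[g][!err]) + 600
}
rapidsplit(data = prepped, subjvar = "session_id",
           diffvars = "congruent", subscorevar = "blocktype",
           aggvar = "latency", splits = 100)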
Author(s):

Sercan Kahveci
References:

Kahveci, S., Bathke, A. C., & Blechert, J. (2024). Reaction-time task reliability is more accurately computed with permutation-based split-half correlations than with Cronbach's alpha. Psychonomic Bulletin & Review. doi:10.3758/s13423-024-02597-y
Examples:

data(foodAAT)

# Reliability of the double difference score:
# [RT(push food) - RT(pull food)] - [RT(push object) - RT(pull object)]
frel <- rapidsplit(data = foodAAT,
                   subjvar = "subjectid",
                   diffvars = c("is_pull", "is_target"),
                   stratvars = "stimid",
                   aggvar = "RT",
                   splits = 100)

print(frel)
plot(frel, type = "all")

# Compute a single random split-half reliability of the error rate
rapidsplit(data = foodAAT,
           subjvar = "subjectid",
           aggvar = "error",
           splits = 1,
           aggfunc = "means")

# Compute the reliability of an IAT D-score
data(raceIAT)

rapidsplit(data = raceIAT,
           subjvar = "session_id",
           diffvars = "congruent",
           subscorevar = "blocktype",
           aggvar = "latency",
           errorhandling = list(type = "fixedpenalty", errorvar = "error",
                                fixedpenalty = 600, blockvar = "block_number"),
           splits = 100,
           standardize = TRUE)

# Unstratified reliability of the median RT
rapidsplit.chunks(data = foodAAT,
                  subjvar = "subjectid",
                  aggvar = "RT",
                  splits = 100,
                  aggfunc = "medians",
                  chunks = 8)

# Compute the reliability of Tukey's trimean of the RT, on 2 CPU cores
trimean <- function(x) {
  sum(quantile(x, c(.25, .5, .75)) * c(1, 2, 1)) / 4
}

rapidsplit.chunks(data = foodAAT,
                  subjvar = "subjectid",
                  aggvar = "RT",
                  splits = 200,
                  aggfunc = trimean,
                  cluster = 2)