View source: R/sced_analysis.r
sced_analysis | R Documentation |
Analyse data from an AB design SCED experiment using non-parametric frequentist tests
sced_analysis( data, n_boots = 2000, invert_effect_sizes = FALSE, adjust_probability_ceiling = TRUE, assess_deviation_outliers = FALSE )
data |
Experiment data. This must contain columns named "Participant", "Timepoint" (integer), "Score" (numeric; your DV), and "Condition" (must include only "A" and "B" as a string or factor). See the included simulated_data dataset for an example using |
n_boots: |
number of bootstrapped resamples for Hedges' g and Ruscio's A. N for p value permutation is n_boots*10. |
invert_effect_sizes: |
Effect sizes are reported assuming that scores in timepoint B are expected to be higher than timepoint A (i.e., that the intervention causes scores to increase). If invert_effect_sizes == TRUE then effect sizes are inverted, e.g., if the intervention is expected to causes scores to decrease. |
adjust_probability_ceiling: |
Should Ruscio's A estimates of 0 and 1 be adjusted so that they can be converted to finite odds ratios? This is done by rescoring a single data point as being was inferior to a single second data point between the conditions. I.e., it uses the best granularity allowed by the data, as more data points will result in a more extreme possible values of A. |
assess_deviation_outliers: |
By default, a meta analysis of deviations at baseline is run in order to detect outliers. SCED analysts often argue that you should start with a stable baseline. This represents an attempt to quantify what counts as stable vs not by comparing baseline variation between participants. However, the code may throw an error here if any participant's variation is extremely low. If so, you can simply not run this analysis by setting this parameter to FALSE. |
n_A: Number of data points in phase A
n_B: Number of data points in phase B
deviation_A: Median Absolute Deviation of data points in phase A
deviation_B: Median Absolute Deviation of data points in phase B
deviation_A_likely_outlier: It can be useful to know if a participant demonstrates poorer consistency within the baseline phase (phase A). There is a tradition within the visual analysis of SCED data to exclude participants who demonstrate such low 'consistency'. In addition to this, the meta analysis of the unstandardized effect size between participants (i.e., the calculation of the median median-difference between phases) tacitly relies on equal variances between participants. Although, it should be noted that both standardized effect sizes (Ruscio's A and Hedges' g) do *not* rely on the assumption of equal variance between participants (indeed, this is one of the rationales for standardization), and so no exclusions based on baseline consistency are necessary for the correct interpretation of them. In order to highlight outliers, this package relies on between subject meta analysis of median absolute deviations of the data in phase A. The metafor package's influence.rma.uni() function is used to conduct leave one out analyses. Four separate metrics that can indicate whether a participant are calculated, and if one or more flag the participant as a potential outlier this variable is returned with a star ("*"). See the metafor package's documentation for more details. This variable should not be used to thoughtlessly exclude participants, but should be combined with insepction of the SCED plot and best judgement. Sensitivity analyses run with and wihtout any outliers can also be useful.
trend_A: ordinal regression slope between the phase A data points by timepoint. Treats the timepoints as ordinally spaced integers (e.g., rather than modelling them as dates). Can be used to exclude participants from consideration in meta analysis, e.g., on the basis that differences between phases can be difficult to interpret if improvement trends are observed in phase A. Sensitivity analyses can be conducted by excluding participants with phase A trends greater than a given absolute value (e.g., +/-0.3).
trend_B: ordinal regression slope between the phase B data points by timepoint. Can indicate ongoing improvement in phase B.
p: Hypothesis test p value via permutation test assessing the differences in scores between the two phases. Calculated via Monte-Carlo simulation (10000 runs) rather than brute force.
median_difference: Difference in scores between the median values in the two phases. Because Ruscio's A effect size is probability based (see below) it can suffer from ceiling effects. I.e., when all timepoints in phase B are larger than all timepoints in phase A, Ruscio's A cannot distinguish futher between large and very large effect sizes. It can therefore be useful to report the unstandardized effect size as well as the standardized effect sizes.
ruscios_A: Differences in scores between the two phases using Ruscio's A (Ruscio, 2008). The probability that a randomly selected timepoint in phase B is larger than a randomly selected timepoint in phase A. Ranges from 0 to 1, where 0.5 is equal chance. Highly similar to Area Under the Curve (AUC)/the Common Language Effect Size (CLES)/and the probability of superiority. A Cohen's d of 1.5 corrisponds to an A of 0.85.
hedges_g: Differences in scores between the two phases using Hedge's g effect size via bootstrapping, a version of Cohen's d that is bias corrected for small sample sizes. Identical range, interpretation and cutoffs as Cohen's d. Included here for familiarity: it's parametric assumtions (equal variances) and sensitivity to equal number of timepoints in A and B make it somewhat unrobust in many SCED contexts. In order to relax the assumption of normality a bootstrapped implemenation is employed.
sced_results <- sced_analysis(data = simulated_data)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.