knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
knitr::opts_chunk$set(echo = TRUE)

# Set a random seed
set.seed(5989320)

# Libraries necessary for this vignette
library(rio)
library(flextable)
library(dplyr)
library(tidyr)
library(psych)
library(semanticprimeR)

# Function for simulation
item_power <- function(data, # name of data frame
                       dv_col, # name of DV column as a character
                       item_col, # name of item column as a character
                       nsim = 10, # small for cran
                       sample_start = 20,
                       sample_stop = 200,
                       sample_increase = 5,
                       decile = .5){

  DF <- cbind.data.frame(
    "dv" = data[ , dv_col],
    "items" = data[ , item_col]
  )

  # just in case
  colnames(DF) <- c("dv", "items")

  # figure out the "sufficiently narrow" ci value
  SE <- tapply(DF$dv, DF$items, function (x) { sd(x)/sqrt(length(x)) })
  cutoff <- quantile(SE, probs = decile)

  # sequence of sample sizes to try
  samplesize_values <- seq(sample_start, sample_stop, sample_increase)

  # create a blank table for us to save the values in
  sim_table <- matrix(NA,
                      nrow = length(samplesize_values) * nsim,
                      ncol = length(unique(DF$items)))

  # make it a data frame
  sim_table <- as.data.frame(sim_table)

  # add a place for sample size values
  sim_table$sample_size <- NA

  iterate <- 1
  for (p in 1:nsim){
    # loop over sample sizes
    for (i in 1:length(samplesize_values)){

      # temp that samples and summarizes
      temp <- DF %>%
        group_by(items) %>%
        sample_n(samplesize_values[i], replace = T) %>%
        summarize(se = sd(dv)/sqrt(length(dv)))

      # dv on items
      colnames(sim_table)[1:length(unique(DF$items))] <- temp$items
      sim_table[iterate, 1:length(unique(DF$items))] <- temp$se
      sim_table[iterate, "sample_size"] <- samplesize_values[i]
      sim_table[iterate, "nsim"] <- p

      iterate <- iterate + 1
    }
  }

  # figure out cut off
  final_sample <- sim_table %>%
    pivot_longer(cols = -c(sample_size, nsim)) %>%
    dplyr::rename(item = name, se = value) %>%
    group_by(sample_size, nsim) %>%
    summarize(percent_below = sum(se <= cutoff)/length(unique(DF$items))) %>%
    ungroup() %>%
    # then summarize all down averaging percents
    dplyr::group_by(sample_size) %>%
    summarize(percent_below = mean(percent_below)) %>%
    dplyr::arrange(percent_below) %>%
    ungroup()

  return(list(SE = SE,
              cutoff = cutoff,
              DF = DF,
              sim_table = sim_table,
              final_sample = final_sample))
}
Attractiveness Ratings
Data provided by: Carlota Batres
This dataset contains 200 participants rating 20 faces on attractiveness. Ethical approval was received from the Franklin and Marshall Institutional Review Board, and each participant provided informed consent. All participants were located in the United States. Participants were instructed that they would be viewing several faces photographed facing forward, under constant camera and lighting conditions, with neutral expressions and closed mouths, and that they would rate the attractiveness of each presented face. More specifically, participants were asked "How attractive is this face?", where 1 = "Not at all attractive" and 7 = "Very attractive". Participants rated each face individually, in random order, and with no time limit. Upon completion, participants were paid for their participation in the study.
The data was collected online using Amazon’s Mechanical Turk platform.
Included with the vignette.
DF <- import("data/batres_data.sav")
str(DF)
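If you want to preview the expected structure without opening the .sav file, a minimal sketch of a mock dataset with the same shape (200 participants by 20 face columns, rated 1 to 7) might look like the following. Note that mock_DF, the face_ column labels, and the uniform ratings are hypothetical placeholders, not the real data, and the actual column names may differ.

# Hypothetical mock data matching the described shape: one row per
# participant, one column per face, ratings on a 1-7 scale (uniform
# sampling is only a placeholder, not a model of real responses)
mock_DF <- data.frame(Participant_Number = 1:200,
                      matrix(sample(1:7, 200 * 20, replace = TRUE),
                             nrow = 200, ncol = 20,
                             dimnames = list(NULL, paste0("face_", 1:20))))
str(mock_DF)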
No official publication date.
Batres, C. (2022). Attractiveness Ratings. [Data set].
faces, ratings
Attribution-NonCommercial-ShareAlike CC BY-NC-SA
United States
metadata <- import("data/batres_metadata.xlsx")
flextable(metadata) %>% autofit()
The data should be in long format with each rating on one row of data.
# Reformat the data
DF_long <- pivot_longer(DF, cols = -c(Participant_Number)) %>%
  dplyr::rename(item = name, score = value)

flextable(head(DF_long)) %>% autofit()
# Run the simulation
var1 <- item_power(data = DF_long, # name of data frame
                   dv_col = "score", # name of DV column as a character
                   item_col = "item", # name of item column as a character
                   nsim = 10,
                   sample_start = 20,
                   sample_stop = 300,
                   sample_increase = 5,
                   decile = .4)
What is the usual standard error for the data that could be considered for our stopping rule, using the 40% decile?
# individual SEs
var1$SE

# cutoff value at the 40% decile
var1$cutoff
Using our 40% decile as a guide, we find that `r round(var1$cutoff, digits = 3)` is our target standard error for an accurately measured item.
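As a quick sanity check on the decile logic, the proportion of item standard errors at or below this cutoff should be close to .40 (only approximately, because quantile() interpolates between observed values):

# proportion of item SEs at or below the 40% decile cutoff; should be
# close to .40
mean(var1$SE <= var1$cutoff)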
To estimate the minimum sample size, we need to determine how many participants it would take for 80%, 85%, 90%, and 95% of item standard errors to fall below our critical value of `r round(var1$cutoff, digits = 3)`.
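Before applying any correction, you can inspect the uncorrected simulation summary returned by item_power(), which stores the proportion of items at or below the cutoff for each simulated sample size:

# uncorrected simulation summary: proportion of items with SE at or
# below the cutoff at each simulated sample size
head(var1$final_sample)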
# Calculate the cutoff score using semanticprimeR
cutoff <- calculate_cutoff(population = DF_long,
                           grouping_items = "item",
                           score = "score",
                           minimum = 1,
                           maximum = 7)

# showing that semanticprimeR's function matches the hand-calculated
# cutoff from above
cutoff$cutoff
Please note that you will always need to simulate sample sizes larger than the pilot sample size to get the starting numbers; we will correct them below. As shown in our manuscript, we need to correct for the overestimation of sample sizes based on the original pilot data size. Given that the pilot data is large (N = `r nrow(DF)`), this correction is especially useful. This correction is built into our function.
final_table <- calculate_correction(
  proportion_summary = var1$final_sample,
  pilot_sample_size = nrow(DF),
  proportion_variability = cutoff$prop_var
)

flextable(final_table) %>% autofit()
Our minimum suggested sample size does not fall at exactly 80% of the items; instead, we can use the first available value over 80% (n = `r final_table %>% filter(percent_below >= 80) %>% arrange(percent_below, corrected_sample_size) %>% slice_head() %>% pull(corrected_sample_size) %>% round()`) as the minimum.
While there are many considerations for maximum sample size (time, effort, resources), the simulation suggests that `r final_table %>% filter(percent_below >= 95) %>% arrange(percent_below, corrected_sample_size) %>% slice_head() %>% pull(corrected_sample_size) %>% round()` participants would ensure that nearly all items meet the cutoff criterion.
In any estimate of sample size, you should also consider the potential for missing or unusable data due to other exclusion criteria in your study (e.g., attention checks, speeding, failing to answer check questions correctly). In this study, we expect all participants to see all items; therefore, we could use the minimum sample size as our final sample size, the point at which all items reach our SE criterion, or the maximum sample size.
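For example, one simple way to budget for exclusions is to inflate the corrected minimum by the expected rate of unusable data; the 10% exclusion rate below is a hypothetical placeholder, not an estimate from this study.

# pull the corrected minimum sample size from the table above
minimum_n <- final_table %>%
  filter(percent_below >= 80) %>%
  arrange(percent_below, corrected_sample_size) %>%
  slice_head() %>%
  pull(corrected_sample_size) %>%
  round()

# hypothetical expected exclusion rate of 10%
exclusion_rate <- .10

# recruit enough participants that the usable sample still meets the minimum
ceiling(minimum_n / (1 - exclusion_rate))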