stratify | R Documentation |
The function stratify() takes as input any data frame with observations (rows) that you wish to stratify into clusters. Typically, the goal of such stratification is to develop a sampling design that maximizes generalizability. This function, and the others in this package, are designed to mimic the website https://www.thegeneralizer.org/.
stratify(
data = NULL,
guided = TRUE,
n_strata = NULL,
variables = NULL,
idvar = NULL,
verbose = TRUE
)
data |
data.frame object containing the population data to be stratified (observations as rows); must include a unique id variable for each observation, as well as covariates. |
guided |
logical, defaults to TRUE. Whether the function should be guided (ask questions and behave interactively throughout) or not. If set to FALSE, the user must provide values for the other arguments below. |
n_strata |
integer, defaults to NULL. If guided is set to FALSE, must provide the number of strata into which the population will be divided. |
variables |
character, defaults to NULL. If guided is set to FALSE, must provide a character vector of the names of stratifying variables (from population data frame) |
idvar |
character, defaults to NULL. If guided is set to FALSE, must provide the name of the ID variable (from population data frame) as a character string |
verbose |
logical, defaults to TRUE. |
The list contains 14 components: idvar, variables, dataset, n_strata, solution, pop_data_by_stratum, summary_stats, data_omitted, cont_data_stats, cat_data_levels, heat_data, heat_data_simple, heat_data_kable, and heat_plot.
pop_data_by_stratum: a tibble with number of rows equal to the number of rows in the inference population (data) and number of columns equal to the number of stratifying variables (dummy-coded if applicable) plus the ID column (idvar) and a column representing stratum membership, Stratum.
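The shape of pop_data_by_stratum described above can be illustrated with a small base-R sketch. This is not the implementation used by stratify(); it swaps in plain k-means from the stats package on a made-up population, and the names pop, school_id, enrollment, and stratifying_vars are hypothetical.

```r
# Illustrative only: a base-R k-means sketch, not generalizeR's implementation.
set.seed(42)

# Hypothetical population: 200 schools with an ID and two covariates.
pop <- data.frame(
  school_id  = sprintf("S%03d", 1:200),
  enrollment = round(runif(200, min = 100, max = 2000)),
  pct_female = runif(200, min = 0.35, max = 0.65)
)

stratifying_vars <- c("enrollment", "pct_female")
n_strata <- 4

# Standardize so that no covariate dominates the distance calculation.
scaled <- scale(pop[, stratifying_vars])

# Partition the population into n_strata clusters.
fit <- kmeans(scaled, centers = n_strata, nstart = 25)

# One row per unit: ID column, stratifying variables, and stratum membership.
pop_by_stratum <- data.frame(
  pop[, c("school_id", stratifying_vars)],
  Stratum = fit$cluster
)

# Number of population units assigned to each stratum.
table(pop_by_stratum$Stratum)
```

The resulting data frame has one row per population unit and one column per stratifying variable, plus the ID and Stratum columns, which mirrors the layout described for pop_data_by_stratum.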
The function returns a list of class "generalizeR_stratify" that can be provided as input to recruit(). More information on the components of this list can be found above under "Details."
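As a rough sketch of how the returned list might be inspected, the snippet below reuses the covariate names, ID variable, and inference_pop data set from this page's own example. It assumes the generalizeR package is installed and is guarded so that it does nothing otherwise; the component names are the ones listed under "Details."

```r
# Covariate names and the ID variable are taken from this page's example.
selection_covariates <- c("total", "pct_black_or_african_american",
                          "pct_white", "pct_female",
                          "pct_free_and_reduced_lunch")

# Guarded so the snippet is a no-op when generalizeR is not installed.
if (requireNamespace("generalizeR", quietly = TRUE)) {
  result <- generalizeR::stratify(
    generalizeR:::inference_pop,
    guided    = FALSE,
    n_strata  = 4,
    variables = selection_covariates,
    idvar     = "ncessch"
  )

  print(class(result))                     # class of the returned object
  print(names(result))                     # components of the returned list
  print(head(result$pop_data_by_stratum))  # ID, covariates, and Stratum
}
```

Storing the result in an object, rather than calling stratify() for its printed output alone, keeps the list available for later use.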
Tipton, E. (2014). Stratified sampling using cluster analysis: A sample selection strategy for improved generalizations from experiments. Evaluation Review, 37(2), 109-139.
Tipton, E. (2014). How generalizable is your experiment? An index for comparing experimental samples and populations. Journal of Educational and Behavioral Statistics, 39(6), 478-501.
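library(generalizeR)
library(tidyverse)
# Names of the stratifying covariates in the inference population
selection_covariates <- c("total", "pct_black_or_african_american",
                          "pct_white", "pct_female", "pct_free_and_reduced_lunch")
# Non-guided call: n_strata, variables, and idvar must all be supplied
stratify(generalizeR:::inference_pop, guided = FALSE, n_strata = 4,
         variables = selection_covariates, idvar = "ncessch")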