View source: R/add_weights_cluster.R
add_weights_cluster | R Documentation |
Adds survey weights to data from a cluster survey design, where the sample population was drawn from a larger source population.
add_weights_cluster(
  x,
  cl,
  eligible,
  interviewed,
  cluster_x = NULL,
  cluster_cl = NULL,
  household_x = NULL,
  household_cl = NULL,
  ignore_cluster = TRUE,
  ignore_household = TRUE,
  surv_weight = "surv_weight",
  surv_weight_ID = "surv_weight_ID"
)
x |
a data frame of survey data |
cl |
a data frame containing a list of clusters and the number of households in each. |
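eligible |
the column in x giving the number of individuals eligible in each household. |
interviewed |
the column in x giving the number of individuals interviewed in each household. |
cluster_x |
the column in x identifying the cluster each row belongs to. |
cluster_cl |
the column in cl listing the clusters. |
household_x |
the column in x identifying the household each row belongs to. |
household_cl |
the column in cl giving the number of households in each cluster. |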
ignore_cluster |
If TRUE (default), set the weight for clusters to 1.
This assumes that your sample of clusters was taken in a way that closely
approximates a simple random sample. Ignores inputs from cluster_x and cluster_cl. |
ignore_household |
If TRUE (default), set the weight for households to 1.
This assumes that your sample of households was taken in a way that closely
approximates a simple random sample. Ignores inputs from household_x and household_cl. |
surv_weight |
the name of the new column to store the weights. Defaults to "surv_weight". |
surv_weight_ID |
the name of the new ID column to be created. Defaults to "surv_weight_ID". |
The weight is the product of the inverse probabilities of a cluster being selected, a household being selected within a cluster, and an individual being selected within a household, as follows:
((clusters available) / (clusters surveyed)) * ((households in each cluster) / (households surveyed in each cluster)) * ((individuals eligible in each household) / (individuals interviewed))
In the case where both ignore_cluster and ignore_household are TRUE, this will simply be:
1 * 1 * (individuals eligible in each household) / (individuals interviewed)
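The arithmetic above can be checked by hand with a few lines of base R. This is only a sketch: `manual_cluster_weight()` is a hypothetical helper written for illustration, not part of the package, and the input numbers are made up. It applies the stated formula to plain numbers and does not attempt to reproduce how `add_weights_cluster()` counts surveyed clusters or households internally.

```r
# Hypothetical helper (not part of the package): applies the documented
# formula to plain numbers so the arithmetic can be verified by hand.
manual_cluster_weight <- function(clusters_available, clusters_surveyed,
                                  households_in_cluster, households_surveyed,
                                  eligible, interviewed,
                                  ignore_cluster = TRUE,
                                  ignore_household = TRUE) {
  # Each factor is an inverse selection probability; an ignored level is 1.
  cluster_w    <- if (ignore_cluster) 1 else clusters_available / clusters_surveyed
  household_w  <- if (ignore_household) 1 else households_in_cluster / households_surveyed
  individual_w <- eligible / interviewed
  cluster_w * household_w * individual_w
}

# Illustrative numbers: 4 clusters listed, 2 surveyed; 23 households in the
# cluster, 2 surveyed; 6 individuals eligible in the household, 4 interviewed.
full <- manual_cluster_weight(4, 2, 23, 2, 6, 4,
                              ignore_cluster = FALSE,
                              ignore_household = FALSE)
full
# (4 / 2) * (23 / 2) * (6 / 4) = 34.5

# With both levels ignored (the defaults), only the individual factor remains.
individual_only <- manual_cluster_weight(4, 2, 23, 2, 6, 4)
individual_only
# 1 * 1 * (6 / 4) = 1.5
```

With the default `ignore_cluster = TRUE` and `ignore_household = TRUE`, the cluster and household counts have no effect on the result, which matches the simplified expression above.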
Alex Spina, Zhian N. Kamvar, Lukas Richter
# define a fake dataset of survey data
# including household and individual information
x <- data.frame(stringsAsFactors = FALSE,
  cluster = c("Village A", "Village A", "Village A", "Village A",
              "Village A", "Village B", "Village B", "Village B"),
  household_id = c(1, 1, 1, 1, 2, 2, 2, 2),
  eligible_n = c(6, 6, 6, 6, 6, 3, 3, 3),
  surveyed_n = c(4, 4, 4, 4, 4, 3, 3, 3),
  individual_id = c(1, 2, 3, 4, 4, 1, 2, 3),
  age_grp = c("0-10", "20-30", "30-40", "50-60",
              "50-60", "20-30", "50-60", "30-40"),
  sex = c("Male", "Female", "Male", "Female",
          "Female", "Male", "Female", "Female"),
  outcome = c("Y", "Y", "N", "N", "N", "N", "N", "Y")
)

# define a fake dataset of cluster listings
# including cluster names and number of households
cl <- tibble::tribble(
  ~cluster,    ~n_houses,
  "Village A", 23,
  "Village B", 42,
  "Village C", 56,
  "Village D", 38
)

# add weights to a cluster sample
# include weights for cluster, household and individual levels
add_weights_cluster(x, cl = cl,
                    eligible = eligible_n,
                    interviewed = surveyed_n,
                    cluster_cl = cluster,
                    household_cl = n_houses,
                    cluster_x = cluster,
                    household_x = household_id,
                    ignore_cluster = FALSE,
                    ignore_household = FALSE)

# add weights to a cluster sample
# ignore weights for cluster and household level (set equal to 1)
# only include weights at individual level
add_weights_cluster(x, cl = cl,
                    eligible = eligible_n,
                    interviewed = surveyed_n,
                    cluster_cl = cluster,
                    household_cl = n_houses,
                    cluster_x = cluster,
                    household_x = household_id,
                    ignore_cluster = TRUE,
                    ignore_household = TRUE)