survey: Functions for post-stratification adjustments in surveys


Description

# Main utilities here:

### Functions for linear combinations

Post-stratification jackknife estimate of the SE of a linear combination

Usage

std(x, div = sum(x))

jk_lin_comb_se(x, by, w, check = TRUE)

Arguments

x

vector of weights

div

divisor (default: sum(x))

by

stratification variable or list of stratification variables. Default: a single level, for unstratified samples

w

vector of weights, e.g. Horvitz-Thompson weights or a renormalized version of them. Default: 1
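To make the idea of a jackknife SE concrete, here is a minimal base R sketch of the textbook delete-one jackknife for a weighted mean. It is not the code behind jk_lin_comb_se or jk_wtd_mean_se: the function name jk_se_sketch is hypothetical, and the sketch ignores stratification ('by') entirely.

```r
# Hypothetical helper: delete-one jackknife SE of a weighted mean.
# Assumption: unstratified sample; this is NOT the package's implementation.
jk_se_sketch <- function(x, w = rep(1, length(x))) {
  n <- length(x)
  # Weighted mean recomputed with observation i left out
  theta_del <- vapply(
    seq_len(n),
    function(i) sum(w[-i] * x[-i]) / sum(w[-i]),
    numeric(1)
  )
  # Standard delete-one jackknife variance formula
  sqrt((n - 1) / n * sum((theta_del - mean(theta_del))^2))
}

x <- c(2, 4, 4, 7, 9)
jk_se_sketch(x)                       # equal weights
jk_se_sketch(x, w = c(1, 2, 1, 3, 1)) # unequal weights
```

With equal weights this reduces to the usual standard error of the mean, sd(x) / sqrt(length(x)), which is a convenient sanity check.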

Details

capply(x, by, FUN, ...): applies FUN to each 'chunk' of x defined by levels of 'by'. "..." are additional arguments to FUN. 'by' can be a variable or a list, often a set of columns in the data set in which 'x' lives. The result has the same shape as 'x' and can thus be added as a variable to the data frame for 'x'. If FUN returns a single value in each chunk, the value is recycled to fill the elements corresponding to the chunk.

For example, in a data frame, data, with variables: state, county, sex, population where population is the population size within each county (within state) by sex combination:

> data$pop.state <- with(data, capply(population, list(sex,state), sum))

creates a variable that is equal to the total population within each state x sex combination, repeated for each county.
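The behaviour described for capply resembles base R's ave(), which also returns a result the same length as its input and recycles a per-group summary. The sketch below uses ave() as a stand-in on a small made-up data frame; it illustrates the idea and is not the package's own code.

```r
# Toy data (made up for illustration): one row per county x sex combination
data <- data.frame(
  state      = c("A", "A", "A", "A", "B", "B"),
  county     = c("c1", "c1", "c2", "c2", "c3", "c3"),
  sex        = c("F", "M", "F", "M", "F", "M"),
  population = c(10, 20, 30, 40, 50, 60)
)

# Base R analogue of capply(population, list(sex, state), sum):
# the state x sex total is repeated on every row of that combination
data$pop.state <- with(data, ave(population, sex, state, FUN = sum))
data
```

Because the result has one value per row of 'data', it can be assigned directly as a new column.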

up(data, by) keeps rows corresponding to unique values of the strata defined by the variable or list of variables in 'by'. 'by' can also be a formula, evaluated in 'data'; for instance, 'by = ~ a + b' is equivalent to 'by = data[,c('a','b')]'. With the 'data' used above:

> data.state <- up(data, ~ state + sex)

creates a data frame with one row per state x sex combination, in which the variable 'pop.state' contains the total population in each combination. Variables such as 'county' and 'population', which are not invariant within levels of 'by', are dropped.
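The following base R sketch mimics the behaviour described for up(): one row per stratum, keeping only the columns that are constant within every stratum. The helper name up_sketch and its character-vector 'by_cols' argument are hypothetical, and the real up() may differ in details such as row order or formula handling.

```r
# Hypothetical helper: keep one row per stratum and drop columns
# that vary within any stratum. Not the package's implementation.
up_sketch <- function(data, by_cols) {
  key <- interaction(data[by_cols], drop = TRUE)
  # A column is kept only if it takes a single value in every stratum
  invariant <- vapply(
    data,
    function(col) all(tapply(col, key, function(v) length(unique(v)) == 1)),
    logical(1)
  )
  out <- data[!duplicated(key), invariant, drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy data (made up for illustration)
data <- data.frame(
  state      = c("A", "A", "A", "A", "B", "B"),
  county     = c("c1", "c1", "c2", "c2", "c3", "c3"),
  sex        = c("F", "M", "F", "M", "F", "M"),
  population = c(10, 20, 30, 40, 50, 60)
)
data$pop.state <- with(data, ave(population, sex, state, FUN = sum))

data.state <- up_sketch(data, c("state", "sex"))
data.state
```

Here 'county' and 'population' vary within the A x F and A x M strata, so they are dropped, while 'pop.state' is constant within each stratum and is kept.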

tab(data, fmla):

creates a frequency table showing the number of rows in 'data' for each stratum formed by evaluating the formula 'fmla' in 'data'.

For example 'tab(data, ~ a + b + c)' will return an array of frequencies for variables 'a', 'b', and 'c' in 'data'. If the formula has a variable on the left side, that variable is summed to get the entries of the table.

For example, if 'population' contains the population in each row where each row is a 'sex x state x county' combination:

> tab(data, population ~ sex + age)

will produce a table of total population by sex and age, and

> tab(data, population ~ age)

will show overall totals by age group.

These tables can be transformed into data frames with, e.g.,

> as.data.frame(tab(data, population ~ age))
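Base R's xtabs() follows the same formula convention (counts when the left side is empty, sums of the left-side variable otherwise), so it can serve as a rough stand-in for tab(). The toy data below are made up and have no 'age' variable, so the sketch tabulates by 'sex' and 'state' instead; the output format of tab() itself may differ.

```r
# Toy data (made up for illustration)
data <- data.frame(
  state      = c("A", "A", "A", "A", "B", "B"),
  county     = c("c1", "c1", "c2", "c2", "c3", "c3"),
  sex        = c("F", "M", "F", "M", "F", "M"),
  population = c(10, 20, 30, 40, 50, 60)
)

# No left-hand side: number of rows in each sex x state stratum
n_rows <- xtabs(~ sex + state, data = data)

# Left-hand side present: 'population' is summed within each stratum
pop_sex_state <- xtabs(population ~ sex + state, data = data)

# Collapse to one margin and convert to a data frame (totals in 'Freq')
pop_state_df <- as.data.frame(xtabs(population ~ state, data = data))
pop_state_df
```

The converted data frame has one row per level of 'state', with the summed population stored in the column 'Freq'.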

std(x, div): standardizes a vector of weights 'x' using the divisor 'div' (default: sum(x)).
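The usage line std(x, div = sum(x)) suggests that standardizing means dividing each weight by 'div'. The sketch below rests on that assumption; std_sketch is a hypothetical name, not the package function.

```r
# Hypothetical helper, assuming standardization is division by 'div'
std_sketch <- function(x, div = sum(x)) x / div

w <- c(2, 3, 5)
std_sketch(w)                # default divisor: weights sum to 1
std_sketch(w, div = mean(w)) # alternative divisor: weights average to 1
```

The default divisor rescales the weights to sum to 1; passing another divisor, such as mean(w), gives a different normalization of the same relative weights.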

See Also

wtd_mean for weighted means, lin_comb for linear combinations, jk_wtd_mean_se for a jackknife estimate of the SE of a weighted mean, and jk_lin_comb_se for a jackknife estimate of the SE of a linear combination.


gmonette/WWCa documentation built on May 17, 2019, 7:25 a.m.