summarize_rep_weights: Summarize the replicate weights

View source: R/summarize_rep_weights.R

Description

Summarize the replicate weights of a replicate design, either overall or column by column.

Usage

summarize_rep_weights(rep_design, type = "both", by)

Arguments

rep_design

A replicate design object, created with either the survey or srvyr packages.

type

Default is "both". Use type = "overall" for an overall summary of the replicate weights. Use type = "specific" for a summary of each column of replicate weights, with one row of the summary per column.

Use type = "both" for a list containing both summaries, with list elements named "overall" and "specific".

by

(Optional) A character vector with the names of variables used to group the summaries.

Value

If type = "both" (the default), the result is a list of data frames with names "overall" and "specific". If type = "overall", the result is a data frame providing an overall summary of the replicate weights.

The contents of the "overall" summary are the following:

  • "nrows": Number of rows for the weights

  • "ncols": Number of columns of replicate weights

  • "degf_svy_pkg": The degrees of freedom according to the survey package in R

  • "rank": The matrix rank as determined by a QR decomposition

  • "avg_wgt_sum": The average column sum

  • "sd_wgt_sums": The standard deviation of the column sums

  • "min_rep_wgt": The minimum value of any replicate weight

  • "max_rep_wgt": The maximum value of any replicate weight
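The quantities above can be illustrated with base R on a small matrix of weights. This is only a sketch of the definitions, not the package's implementation: the matrix `rep_wts` and its values are made up for illustration, "degf_svy_pkg" is omitted because it depends on the survey package, and treating "sd_wgt_sums" as the sample standard deviation (as computed by `sd()`) is an assumption.

```r
# Toy matrix of replicate weights: 4 rows (cases) by 3 replicate columns.
# The values are hypothetical and chosen only for illustration.
rep_wts <- matrix(
  c(2, 0, 2,
    2, 2, 0,
    0, 2, 2,
    1, 1, 1),
  nrow = 4, byrow = TRUE
)

# Sum of the weights in each replicate column
col_sums <- colSums(rep_wts)

# One-row data frame mirroring the fields of the "overall" summary
overall_sketch <- data.frame(
  nrows       = nrow(rep_wts),
  ncols       = ncol(rep_wts),
  rank        = qr(rep_wts)$rank,  # matrix rank from a QR decomposition
  avg_wgt_sum = mean(col_sums),    # average column sum
  sd_wgt_sums = sd(col_sums),      # assumption: sample standard deviation
  min_rep_wgt = min(rep_wts),
  max_rep_wgt = max(rep_wts)
)

print(overall_sketch)
```

A rank smaller than the number of columns would indicate that some replicate columns are linear combinations of others.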

If type = "specific", the result is a data frame providing a summary of each column of replicate weights, with each column of replicate weights described in a given row of the data frame. The contents of the "specific" summary are the following:

  • "Rep_Column": The name of a given column of replicate weights. If columns are unnamed, the column number is used instead

  • "N": The number of entries

  • "N_NONZERO": The number of nonzero entries

  • "SUM": The sum of the weights

  • "MEAN": The average of the weights

  • "CV": The coefficient of variation of the weights (standard deviation divided by mean)

  • "MIN": The minimum weight

  • "MAX": The maximum weight
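The per-column quantities can likewise be sketched in base R. As before, this is an illustration of the definitions rather than the package's code: the matrix `rep_wts`, its values, and its column names are hypothetical, and using the sample standard deviation from `sd()` in the CV is an assumption.

```r
# Toy matrix of replicate weights with named columns (hypothetical values).
rep_wts <- matrix(
  c(2, 0, 2,
    2, 2, 0,
    0, 2, 2,
    1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("REP_1", "REP_2", "REP_3"))
)

# One row per replicate column, mirroring the fields of the "specific" summary
specific_sketch <- data.frame(
  Rep_Column = colnames(rep_wts),
  N          = rep(nrow(rep_wts), ncol(rep_wts)),  # number of entries
  N_NONZERO  = colSums(rep_wts != 0),              # number of nonzero entries
  SUM        = colSums(rep_wts),
  MEAN       = colMeans(rep_wts),
  # Assumption: CV uses the sample standard deviation divided by the mean
  CV         = apply(rep_wts, 2, sd) / colMeans(rep_wts),
  MIN        = apply(rep_wts, 2, min),
  MAX        = apply(rep_wts, 2, max),
  row.names  = NULL
)

print(specific_sketch)
```

Columns with an unusually large CV or an unexpected count of nonzero entries are natural candidates for a closer look after a weight adjustment.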

Examples


# Load packages and example data
suppressPackageStartupMessages(library(survey))
library(svrep) # provides redistribute_weights() and summarize_rep_weights()
data(api)

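# Create a cluster sample design and assign random response statuses
dclus1 <- svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
set.seed(2025) # arbitrary seed so the random statuses are reproducible
dclus1$variables$response_status <- sample(x = c("Respondent", "Nonrespondent",
                                                 "Ineligible", "Unknown eligibility"),
                                           size = nrow(dclus1),
                                           replace = TRUE)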
rep_design <- as.svrepdesign(dclus1)

# Adjust weights for cases with unknown eligibility
ue_adjusted_design <- redistribute_weights(
    design = rep_design,
    reduce_if = response_status %in% c("Unknown eligibility"),
    increase_if = !response_status %in% c("Unknown eligibility"),
    by = c("stype")
)

# Summarize replicate weights

summarize_rep_weights(rep_design, type = "both")

# Summarize replicate weights by grouping variables

summarize_rep_weights(ue_adjusted_design, type = 'overall',
                      by = c("response_status"))

summarize_rep_weights(ue_adjusted_design, type = 'overall',
                      by = c("stype", "response_status"))

# Compare replicate weights

rep_wt_summaries <- lapply(list('original' = rep_design,
                                'adjusted' = ue_adjusted_design),
                           summarize_rep_weights,
                           type = "overall")
print(rep_wt_summaries)


bschneidr/svrep documentation built on Feb. 11, 2025, 4:24 a.m.