weight_by: Create dataset according to its frequency weights

weight_by    R Documentation

Create dataset according to its frequency weights

Description

This is a "brute force" weighting procedure: each row of the dataset is replicated as many times as its case weight. If 'weight' is not an integer, it is rounded to the nearest integer, so cases with a weight below 0.5 are removed from the dataset. SPSS Statistics uses this kind of weighting in several statistical procedures, e.g. for the Spearman correlation coefficient or GLM.
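The replication step described above can be sketched in base R. This is only an illustration, not the package's actual implementation: `replicate_by_weight` is a hypothetical helper, and both the use of base `round()` and the treatment of missing or negative weights as zero are assumptions.

```r
# Hypothetical helper illustrating row replication by case weight.
# Not the expss implementation; written in base R only.
replicate_by_weight <- function(data, weight) {
    # assumption: weights are rounded with base R round()
    counts <- round(weight)
    # assumption: missing or negative weights contribute no rows
    counts[is.na(counts) | counts < 0] <- 0
    # repeat each row index 'counts' times, then select those rows
    data[rep(seq_len(nrow(data)), times = counts), , drop = FALSE]
}

toy <- data.frame(id = c("a", "b", "c"), w = c(2, 0.4, 1.6))
expanded <- replicate_by_weight(toy, toy$w)

# "a" (weight 2) appears twice, "b" (0.4 rounds to 0) is dropped,
# "c" (1.6 rounds to 2) appears twice
expanded$id
```

Because the weighted dataset physically contains the replicated rows, its size grows with the sum of the rounded weights, which is worth keeping in mind for large weights.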

Usage

weight_by(data, weight = NULL)

Arguments

data

data.frame, data.table or matrix. Dataset which will be weighted.

weight

Unquoted name of the weight column in 'data', or a vector of weights. If it is NULL, 'data' is returned unchanged.
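The three ways of supplying 'weight' described above might look as follows. This is a small sketch on a made-up data frame (`toy` and its column `w` are illustrative names, not part of the package), assuming expss is installed.

```r
library(expss)

# illustrative data: 'w' holds the case weights
toy <- data.frame(id = c("a", "b", "c"), w = c(2, 0.4, 1.6))

# weight given as an unquoted column name in 'data'
by_column <- weight_by(toy, w)

# weight given as a vector of the same length as nrow(data)
by_vector <- weight_by(toy, c(2, 0.4, 1.6))

# weight left as NULL: 'data' is returned unchanged
unweighted <- weight_by(toy)

# compare the number of rows in each result
nrow(by_column)
nrow(by_vector)
nrow(unweighted)
```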

Value

'data' with each row replicated according to case weight.

Examples

library(expss)
data(state) # US states
# convert matrix to data.table
states = data.table(state.x77, keep.rownames = "State")

# create weighted dataset
states_weighted = states %>% 
    let(
        # calculate 'weight' variable. 
        weight = Population/100
    ) %>% 
    weight_by(weight)

# Each row in the weighted dataset is represented proportionally to the population of the state
nrow(states) # unweighted number of cases
nrow(states_weighted) # number of cases in the weighted dataset
str(states_weighted)

gdemin/expss documentation built on April 13, 2024, 2:32 p.m.