complete_data: Filter out entities with too few observations

View source: R/panel_data.R

complete_data R Documentation

Filter out entities with too few observations

Description

This function lets you set a minimum number of waves/periods and excludes every entity (individual) that has fewer observations than that.
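The idea of filtering by a minimum number of observations can be sketched in base R. This is only a rough illustration of the concept, not panelr's actual implementation: the `toy` data frame, the `min_waves` threshold, and the helper objects are all hypothetical names made up for this sketch.

```r
# Hypothetical toy panel: three entities observed over up to three waves
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  wave = c(1, 2, 3, 1, 2, 1),
  y    = c(5.1, 5.3, 5.2, 4.8, 4.9, 6.0)
)

# Hypothetical threshold, playing the role of min.waves
min_waves <- 2

# Count how many observations each entity has
obs_per_id <- table(toy$id)

# Keep only the entities that meet the threshold
keep_ids <- names(obs_per_id)[as.vector(obs_per_id) >= min_waves]
filtered <- toy[toy$id %in% keep_ids, ]
```

In this sketch, entity 3 has a single observation and is dropped, while entities 1 and 2 are kept.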

Usage

complete_data(data, ..., formula = NULL, vars = NULL, min.waves = "all")

Arguments

data

A panel_data() frame.

...

Optionally, unquoted variable names/expressions separated by commas to be passed to dplyr::select(). If these are omitted and formula and vars are also NULL, all columns are included.

formula

A formula, like the one you'll be using to specify your model.

vars

As an alternative to formula, a vector of variable names.

min.waves

The minimum number of observations an entity must have in order to be kept. Default is "all", but it can be any number.

Details

If ... (that is, one or more unquoted variable names) is supplied, then formula and vars are ignored. Likewise, formula takes precedence over vars. These are just different methods for selecting variables, and you can choose whichever you are most comfortable with: ... corresponds to the "tidyverse" way, formula is useful for programming or working with model formulas, and vars is a "standard" evaluation method for when you are working with strings.
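As a sketch of the three selection methods described above, the calls below request the same two variables in each style. It assumes the panelr package is installed and reuses the WageData example from this page; the object names `by_dots`, `by_formula`, and `by_vars` and the formula `lwage ~ wks` are illustrative choices, not part of the original documentation.

```r
library(panelr)

data("WageData")
wages <- panel_data(WageData, id = id, wave = t)

# "tidyverse" style: unquoted variable names passed through ...
by_dots <- complete_data(wages, wks, lwage, min.waves = 3)

# Formula style: the variables are taken from a model-like formula
# (this particular formula is only an illustrative assumption)
by_formula <- complete_data(wages, formula = lwage ~ wks, min.waves = 3)

# Standard evaluation style: variable names supplied as strings
by_vars <- complete_data(wages, vars = c("wks", "lwage"), min.waves = 3)
```

Because each call names the same variables and the same min.waves, all three would be expected to retain the same set of entities.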

Value

A panel_data frame.

Examples


library(panelr)

data("WageData")
wages <- panel_data(WageData, id = id, wave = t)
# Keep only entities with at least 3 observations, selecting wks and lwage
complete_data(wages, wks, lwage, min.waves = 3)


panelr documentation built on Aug. 22, 2023, 5:08 p.m.