EstimationData: Select the data from a data frame for estimation.

View source: R/estimationdata.R

EstimationData    R Documentation

Description

Selects the data from a data frame for estimation, conducting imputation if necessary.

Usage

EstimationData(
  formula = NULL,
  data = NULL,
  subset = NULL,
  weights = NULL,
  missing = "Exclude cases with missing data",
  m = 10,
  seed = 12321,
  error.if.insufficient.obs = TRUE,
  remove.missing.levels = TRUE,
  impute.full.data = TRUE
)

Arguments

formula

An object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of type specification are given under ‘Details’.

data

A data.frame.

subset

An optional vector specifying a subset of observations to be used in the fitting process, or the name of a variable in data. It may not be an expression. subset may not

weights

An optional vector of sampling weights, or the name of a variable in data. It may not be an expression.

missing

How missing data is to be treated in the regression. Options are: "Error if missing data", "Exclude cases with missing data", "Use partial data", "Use partial data (pairwise correlations)", "Dummy variable adjustment", "Imputation (replace missing values with estimates)", and "Multiple imputation".

m

Number of imputation samples.

seed

The random number seed used in the imputation.

error.if.insufficient.obs

Logical; if TRUE, an error is thrown when there are more variables than observations.

remove.missing.levels

Logical; whether levels are removed if they do not occur in the observed data.

impute.full.data

Logical; if TRUE and missing is either "Imputation (replace missing values with estimates)" or "Multiple imputation", imputation is performed on both the full data and the requested subset of data; otherwise, imputation is performed only on the subset. Ignored for other options of missing.

Details

Removes any empty levels from factors.
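
The effect of removing empty factor levels can be illustrated with base R alone. This is a minimal sketch using droplevels() on a hypothetical data frame; it is not taken from the package source, and EstimationData may implement the step differently.

```r
# Hypothetical data: factor 'g' declares level "c", which never occurs.
df <- data.frame(
  g = factor(c("a", "b", "a"), levels = c("a", "b", "c")),
  y = c(1, 2, 3)
)

# droplevels() removes unused levels from every factor column.
cleaned <- droplevels(df)

levels(df$g)       # levels before: includes the unused "c"
levels(cleaned$g)  # levels after: only the levels that occur
```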

Value

A list with components

  • estimation.data - tidied (filtered/subsetted and NA-free) data.frame

  • weights - the cleaned weights with any filters applied (i.e. the weights with NA and negative weights set to 0)

  • unfiltered.weights - the cleaned weights from the complete data (i.e. with no filter applied)

  • post.missing.data.estimation.sample - logical vector with length equal to the number of rows of data with a TRUE value in position i indicating that the ith row of data appears in the tidied data estimation.data

  • data - original data (without subset applied), but with imputation performed (if requested)

  • description - character; description of the data; see SampleDescription
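
The relationship between the cleaned weights, the logical estimation-sample vector and the tidied data can be sketched in base R. The inputs below are hypothetical, and the rule that rows with a zero weight are excluded is an assumption made for illustration; this is not the package's implementation.

```r
# Hypothetical inputs (not from the package).
df <- data.frame(y = c(1, 2, NA, 4, 5), x = c(10, NA, 30, 40, 50))
w <- c(1.5, NA, -2, 0.5, 1)
subset.rows <- c(TRUE, TRUE, TRUE, TRUE, FALSE)

# Cleaned weights: NA and negative weights set to 0.
clean.w <- w
clean.w[is.na(clean.w) | clean.w < 0] <- 0

# Logical vector with one entry per row of df; TRUE marks rows kept when
# cases with missing data are excluded. Requiring a positive weight is an
# assumption of this sketch.
in.sample <- subset.rows & complete.cases(df) & clean.w > 0

# Tidied (filtered and NA-free) data.
estimation.data <- df[in.sample, , drop = FALSE]
```

In this sketch in.sample plays the role of post.missing.data.estimation.sample: its length equals nrow(df), and its TRUE positions identify the rows that appear in estimation.data.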

See Also

Imputation, SampleDescription, EstimationDataTemplate
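
Examples

A minimal sketch of a call, using the argument names from the Usage section above. The data frame toy and its variables are hypothetical, and the call runs only if flipData is installed. No output is shown because it depends on the installed version.

```r
# Hypothetical toy data with a few missing values (for illustration only).
toy <- data.frame(
  y  = c(1.2, 2.4, NA, 4.1, 5.3, 6.0, 7.7, 8.1),
  x1 = c(3, NA, 5, 6, 7, 8, 9, 10),
  x2 = factor(c("a", "b", "a", "b", "a", "b", "a", "b"))
)

# Run the call only when flipData is available.
if (requireNamespace("flipData", quietly = TRUE)) {
  result <- flipData::EstimationData(
    formula = y ~ x1 + x2,
    data = toy,
    missing = "Exclude cases with missing data"
  )
  # Inspect two of the components documented under Value.
  str(result$estimation.data)
  print(result$description)
}
```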


NumbersInternational/flipData documentation built on March 2, 2024, 10:52 a.m.