build_dataset: Generate a multilevel long-form data frame

Description

This function builds a long-form data set with n observations from j units.

Usage

build_dataset(n, j, fixed_coef, random_coef_sd, resid_sd,
  n_level_2_var = 2, mean_fixed_level_2 = 0, sd_fixed_level_2 = 1,
  mean_fixed_level_1 = 0, sd_fixed_level_1 = 1)

Arguments

n

Number of observations.

j

Number of units (individuals).

fixed_coef

Vector with fixed effects coefficients.

random_coef_sd

Vector with standard deviations of the random effects.

resid_sd

Scalar, standard deviation of the residuals.

n_level_2_var

Number of level 2 variables.

mean_fixed_level_2

Means of the fixed effects covariates, level 2.

sd_fixed_level_2

Standard deviation of the fixed effects covariates, level 2.

mean_fixed_level_1

Means of the fixed effects covariates, level 1.

sd_fixed_level_1

Standard deviation of the fixed effects covariates, level 1.

Details

The function creates a data frame containing the variables id, y, and covariates named V#, numbered according to the number of fixed effects. The user can specify the number of level 2 covariates. The data are generated in stages. First, the level 2 data, which are constant within a unit, are generated. Next, the observations are assigned to units using the sample function with replace = TRUE. This results in an unequal number of observations per unit, although each unit has an equal probability of being sampled. Lastly, the level 1 data are generated by drawing from the normal distribution.
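
The generation scheme described above can be illustrated with a minimal sketch in base R. This is not the package's source code: the object names (level_2, sketch), the small sizes, and the assumption that V1 and V2 are the level 2 covariates while V3 is a level 1 covariate are all hypothetical, and the outcome y is left out.

```r
# Hypothetical sketch of the generation scheme; not the SEMA implementation.
set.seed(1)
n <- 100  # number of observations (illustrative value)
j <- 5    # number of units (illustrative value)

# Level 2 data: one value per unit, so constant within a unit.
level_2 <- data.frame(V1 = rnorm(j, mean = 0, sd = 1),
                      V2 = rnorm(j, mean = 0, sd = 1))

# Assign each observation to a unit by sampling with replacement.
# Every unit is equally likely, but the realised counts differ.
id <- sample(seq_len(j), size = n, replace = TRUE)

# Level 1 data: one normal draw per observation.
sketch <- data.frame(id = id,
                     level_2[id, ],
                     V3 = rnorm(n, mean = 0, sd = 1),
                     row.names = NULL)

# Number of observations per unit.
table(sketch$id)
```

Tabulating the id column shows how the observations are spread over the units; because the assignment is random, the counts generally differ from unit to unit.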

Value

A data frame with the variable id, which labels the units, the outcome (dependent) variable y, and the covariates.

Examples

## We create a dataset consisting of 2500 observations from 20
## units. The fixed effects have the coefficients 1, 2, 3, 4, and 5.
## The variances of the random effects equal 1, 4, and 9 (standard
## deviations 1, 2, and 3). Lastly, the residual variance equals 4
## (standard deviation 2):
  
test_data <- build_dataset(n = 2500, 
                           j = 20, 
                           fixed_coef = 1:5, 
                           random_coef_sd = 1:3, 
                           resid_sd = 2)

L-Ippel/SEMA documentation built on May 30, 2019, 8:23 a.m.