seqaddNA: Generation of missing data in longitudinal categorical data

View source: R/missings_generation.R

seqaddNA R Documentation

Generation of missing data in longitudinal categorical data

Description

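Generates missing data in the form of gaps, which is the typical pattern of missingness in longitudinal data. The simulated missing data are either missing completely at random (MCAR) or, when the states listed in states.high are given a higher probability of triggering a gap, missing at random (MAR).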

Usage

seqaddNA(
  data,
  var = NULL,
  states.high = NULL,
  propdata = 1,
  pstart.high = 0.1,
  pstart.low = 0.005,
  maxgap = 3,
  only.traj = FALSE
)

Arguments

data

a data frame containing sequences of a multinomial variable with missing data (coded as NA)

var

the list of columns containing the trajectories. Default is NULL, i.e. all the columns.

states.high

list of states that have a larger probability of triggering a subsequent missing data gap

propdata

proportion of observations for which missing data is simulated

pstart.high

probability of starting a missing data gap for the states specified with the states.high argument

pstart.low

probability of starting a missing data gap for the other states

maxgap

maximum length of a missing data gap

only.traj

logical that specifies whether only the trajectories should be returned (only.traj=TRUE), or the whole data (only.traj=FALSE)
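
To make the roles of states.high, pstart.high, pstart.low and maxgap more concrete, the following is a minimal sketch of one way such a gap-generating mechanism could work on a single sequence. It is an illustration under stated assumptions, not the seqimpute implementation: the function name sketch_add_gaps, its argument names, and the choice to draw the gap length uniformly between 1 and maxgap are all hypothetical.

```r
# Hypothetical illustration only: NOT the code used by seqaddNA().
# Walks along one sequence; after each state, a gap starts with probability
# p_high (state listed in states_high) or p_low (any other state).
sketch_add_gaps <- function(x, states_high = NULL,
                            p_high = 0.1, p_low = 0.005, maxgap = 3) {
  n <- length(x)
  out <- x
  t <- 1
  while (t < n) {
    # The starting probability depends on the state observed at position t
    p <- if (!is.na(x[t]) && x[t] %in% states_high) p_high else p_low
    if (runif(1) < p) {
      # Assumption: gap length is drawn uniformly from 1..maxgap
      len <- sample.int(maxgap, 1)
      last <- min(t + len, n)
      out[(t + 1):last] <- NA
      # Resume at the first position after the gap
      t <- last + 1
    } else {
      t <- t + 1
    }
  }
  out
}

set.seed(1)
x <- rep(c("employment", "joblessness"), times = c(20, 20))
y <- sketch_add_gaps(x, states_high = "joblessness", p_high = 0.3)
```

In this sketch, leaving states_high at NULL gives every state the same starting probability (p_low), so missingness does not depend on the observed states; listing states in states_high makes gaps more likely after those states.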

Value

Returns a data frame in which missing data have been simulated

Author(s)

Kevin Emery

Examples

# Generate MCAR missing data on the mvad dataset 
# from the TraMineR package

## Not run: 
data(mvad, package = "TraMineR")
mvad.miss <- seqaddNA(mvad, var = 17:86)


# Generate missing data on mvad where joblessness is more likely to trigger 
# a missing data gap
mvad.miss2 <- seqaddNA(mvad, var = 17:86, states.high = "joblessness")

## End(Not run)
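
After simulating missing data, it can be useful to inspect the resulting gaps. The sketch below uses base R only; the helper gap_lengths and the toy data frame are hypothetical and not part of seqimpute.

```r
# Hypothetical helper (not part of seqimpute): lengths of all NA runs,
# collected row by row over the columns given in `var`.
gap_lengths <- function(data, var = seq_along(data)) {
  unlist(lapply(seq_len(nrow(data)), function(i) {
    # unlist() flattens the one-row data frame into a plain vector
    r <- rle(is.na(unlist(data[i, var], use.names = FALSE)))
    r$lengths[r$values]
  }))
}

# Toy data: one row with a gap of length 2, one with a gap of length 1,
# and one fully observed row
toy <- data.frame(
  t1 = c("a", "b", "a"),
  t2 = c(NA,  "b", "a"),
  t3 = c(NA,  NA,  "a"),
  t4 = c("a", "b", "a")
)

gaps <- gap_lengths(toy)       # lengths of the NA runs
share_na <- mean(is.na(toy))   # overall share of missing cells
```

Applied to the output of seqaddNA, for instance gap_lengths(mvad.miss, var = 17:86), the same helper gives the distribution of simulated gap lengths, which can be compared with the maxgap argument, provided the original columns contained no NA.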


seqimpute documentation built on May 29, 2024, 4:35 a.m.