mar: MAR: Missing at Random

View source: R/mar.R

Description

mar() allows the user to forcibly insert missing values (NA) in a way that replicates data being missing at random (MAR). This kind of missingness depends on the value of another variable in the dataset. The function uses a uniform random variable Ui to act as the variable that the missing data depends on: if Ui > p, then there is a chance q that the value in the i'th position will be missing.
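The mechanism described above can be sketched in a few lines of base R. This is a hypothetical re-implementation for illustration only, not the package's source: the name mar_sketch, the choice to draw a fresh uniform for each column, and the treatment of q as a probability between 0 and 1 are all assumptions.

```r
# Hypothetical sketch of the described mechanism; NOT the mfdata source code.
# Assumptions: q is a probability in [0, 1], and a fresh uniform vector
# is drawn for every column listed in `column`.
mar_sketch <- function(data, p, q, column) {
  n <- nrow(data)
  for (col in column) {
    u <- runif(n)                  # U_i ~ Uniform(0, 1), one draw per row
    eligible <- u > p              # rows where U_i > p
    drop <- eligible & (runif(n) < q)  # eligible rows go missing with chance q
    data[drop, col] <- NA
  }
  data
}

set.seed(1)
df <- data.frame(x = rnorm(1000, 10, 2), y = rpois(1000, 4), z = rbinom(1000, 1, .4))
out <- mar_sketch(df, p = .5, q = .2, column = 1:2)

# Share of missing cells in the two affected columns;
# under these assumptions it should be close to (1 - p) * q = 0.1.
mean(is.na(out[, 1:2]))
```

Column z is left untouched because it is not listed in column.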

Usage

mar(data, p, q, column)

Arguments

data

The data frame that you want to insert NAs into.

p

Value that the uniform draw Ui is compared to. A choice of .5 will make the expression Ui > p true about half of the time.

q

Chance that the i'th value will be missing given Ui > p. The examples pass it as a proportion (q = .2).

column

The column(s) in the data that you want to give missing values to. For multiple columns, use c() with the specified column choices.

Details

Definitely experiment with p and q if you want a specific amount of data to be missing.
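As a rough guide for choosing p and q: if Ui is uniform on (0, 1) and q is a probability between 0 and 1 (both assumptions about the implementation), then Ui > p holds with probability 1 - p, so the expected share of missing values in an affected column is about (1 - p) * q. The helper names below are hypothetical and are not part of the package.

```r
# Hypothetical helpers, assuming U_i ~ Uniform(0, 1) and q in [0, 1].

# Expected proportion of missing values in an affected column.
expected_missing <- function(p, q) {
  (1 - p) * q
}

# Value of q needed to hit a target missing proportion for a given p.
q_for_target <- function(target, p) {
  q <- target / (1 - p)
  if (q > 1) stop("target is not reachable with this p; choose a smaller p")
  q
}

expected_missing(p = .5, q = .2)    # 0.1
q_for_target(target = .15, p = .5)  # 0.3
```

The observed share in any one run will vary around this value, more so for small data sets.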

Examples

## Inputs missing data in the first two columns of the data set df

df<- data.frame(x=rnorm(100, 10, 2), y=rpois(100,4), z=rbinom(100, 1, .4))
df_missing<- mar(data = df, p = .5, q = .2, column = 1:2)
sum(is.na(df_missing))/200

## Inputs missing data into all of the columns in df2

df2<- data.frame(x=rnorm(100, 10, 2), y=rpois(100,4), z=rbinom(100, 1, .4))
df_missing2<- mar(data = df2, p = .5, q = .2, column = 1:3)
sum(is.na(df_missing2))/300

JerryTucay/mfdata documentation built on May 7, 2019, 6:56 p.m.