mar() allows the user to forcibly insert missing values (NA) in a way that mimics data that are missing at random (MAR). This kind of missingness depends on the value of another variable in the dataset. The function uses a uniform random variable U to act as the variable that the missingness depends on: if Ui > p, then the value in the i-th position has chance q of being set to NA.
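The mechanism described above can be sketched in a few lines of R. This is only an illustration, not the package's actual code: the name mar_sketch is hypothetical, and it assumes that U is drawn from Uniform(0, 1) once per row, that q is a proportion between 0 and 1, and that omitting column means every column.

```r
# Hypothetical sketch of the described mechanism; NOT the package's implementation.
mar_sketch <- function(data, p, q, column = seq_along(data)) {
  n <- nrow(data)
  u <- runif(n)          # U_i ~ Uniform(0, 1), one draw per row (assumption)
  eligible <- u > p      # rows in which a value is allowed to go missing
  for (j in column) {
    # within eligible rows, each value is set to NA with probability q
    drop <- eligible & (runif(n) < q)
    data[drop, j] <- NA
  }
  data
}

set.seed(1)
df <- data.frame(x = rnorm(1000, 10, 2), y = rpois(1000, 4))
out <- mar_sketch(df, p = 0.5, q = 0.2, column = 1)
mean(is.na(out$x))  # should land near (1 - 0.5) * 0.2
```

Because the same eligibility indicator is shared across the selected columns in this sketch, missing values in different columns tend to fall in the same rows. Whether mar() behaves that way is not stated in the documentation.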
data | The data frame that you want to insert NAs into.
p | Value to compare the uniform draw to. A choice of .5 will make the expression Ui > p true about half of the time.
q | Chance that the i-th value will be missing given Ui > p, expressed as a proportion as in the examples (e.g. .2).
column | The column(s) in the data that you want to give missing values to. For multiple columns, use c() with the specified column choices.
Definitely experiment with p and q if you want a specific amount of data to be missing. Lowering p or raising q increases the amount of missing data.
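As a rough guide for choosing p and q, the expected share of missing values in a selected column can be worked out directly. The helpers below are hypothetical and assume that U is Uniform(0, 1) and that q is a proportion, so that P(Ui > p) = 1 - p and the expected missing share is (1 - p) * q.

```r
# Hypothetical helpers; assume U_i ~ Uniform(0, 1) and q given as a proportion.
expected_missing <- function(p, q) (1 - p) * q        # expected share of NAs
q_for_target <- function(target, p) target / (1 - p)  # q needed for a target share

expected_missing(p = 0.5, q = 0.2)    # 0.1
q_for_target(target = 0.25, p = 0.5)  # 0.5
```

Under these assumptions, the settings p = .5 and q = .2 used in the examples should leave roughly 10% of the selected values missing.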
## Inputs missing data in the first two columns of the data set df
df<- data.frame(x=rnorm(100, 10, 2), y=rpois(100,4), z=rbinom(100, 1, .4))
df_missing<- mar(data = df, p = .5, q = .2, column = 1:2)
sum(is.na(df_missing))/200
## Inputs missing data into all of the columns in df2
df2<- data.frame(x=rnorm(100, 10, 2), y=rpois(100,4), z=rbinom(100, 1, .4))
df_missing2<- mar(data = df2, p = .5, q = .2)
sum(is.na(df_missing2))/300