missingmat: Random generation of missing values


Description

Random generation of missing values in matrices of numerical data or, preferably, categorical data coded as integers.

Usage

missingmat(mat, nummissing, pattern = "r", nk = 1, p = 0.1, w = 3)

Arguments

mat

A matrix of numerical values

nummissing

number of missing values to generate

pattern

pattern of missing values ("r" random, "l" lowest value, "b" block, "n" not at random)

nk

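index of the variable (column of mat) on which missing values are generated; a vector of indices for pattern "b"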

p

percentage of missing values for pattern "l", given as a proportion (the default 0.1 corresponds to 10%)

w

weight for the lowest category in pps sampling (pattern "n")

Details

The function generates random missing values on a matrix of categorical data according to a specific pattern. "r" is the random pattern. "l" generates a percentage p of missing values on the lowest values of variable nk. "b" generates random blocks of missing values on the group of variables indexed by nk. "n" generates a kind of not-at-random missing values: the lowest values are more likely to be missing, because they are assigned a weight w (greater than 1; the default is 3) and the values are sampled according to an unequal probability sampling design (pivotal sampling; see the reference for more details).
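To make the weighting idea behind pattern "n" concrete, here is a minimal base R sketch. It is not the package's implementation: the helper name sketch_nar and its argument k are hypothetical, and it uses sample() with a prob vector as a simple stand-in for the pivotal sampling design that the Details describe. It assumes the weight w is applied to the cells of a single variable that fall in its lowest category, with weight 1 for all other cells.

```r
# Hypothetical helper, not part of ForImp: set `nummissing` cells of column
# `k` to NA, giving cells in the lowest category weight `w` and the rest 1.
sketch_nar <- function(mat, nummissing, k, w = 3) {
  x <- mat[, k]
  weights <- ifelse(x == min(x, na.rm = TRUE), w, 1)
  # Weighted sampling without replacement via sample() only approximates an
  # unequal probability design; ForImp is documented as using pivotal sampling.
  idx <- sample(seq_along(x), size = nummissing, prob = weights)
  mat[idx, k] <- NA
  mat
}

set.seed(1)
# Toy data: 200 units, 4 variables on a 5-point scale
toy <- matrix(sample(1:5, 200 * 4, replace = TRUE), nrow = 200, ncol = 4)
out <- sketch_nar(toy, nummissing = 40, k = 4, w = 4)

sum(is.na(out))                  # number of cells set to NA
table(toy[is.na(out[, 4]), 4])   # original categories of the blanked cells
```

With w greater than 1, the lowest category should tend to be over-represented among the blanked cells relative to its share in the data.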

Value

The original matrix with the desired number of values randomly substituted by missing values (NA).
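A returned matrix can be inspected with base R alone. The sketch below uses a stand-in object built by hand rather than an actual missingmat() result, so the names original and with_na and the toy dimensions are assumptions made for illustration.

```r
set.seed(1)
# Toy data: 50 units, 3 variables on a 5-point scale
original <- matrix(sample(1:5, 50 * 3, replace = TRUE), nrow = 50, ncol = 3)

# Stand-in for a missingmat() result: 20 cells chosen completely at random
with_na <- original
with_na[sample(length(original), 20)] <- NA

sum(is.na(with_na))        # total number of missing cells
colSums(is.na(with_na))    # missing cells per variable
# TRUE if every non-missing cell still equals its original value
all(with_na == original, na.rm = TRUE)
```

The same three checks apply unchanged to the output of missingmat() when compared with the matrix passed as mat.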

Author(s)

Alessandro Barbiero, Giancarlo Manzi, Pier Alda Ferrari

References

Ferrari P.A., Annoni P., Barbiero A., Manzi G. (2011) An imputation method for categorical variables with application to nonlinear principal component analysis, Computational Statistics & Data Analysis, vol. 55, issue 7, pages 2410-2420, http://www.sciencedirect.com/science/article/pii/S0167947311000521

Examples

library(ForImp)   # missingmat(), transfmatcat()
library(mvtnorm)  # rmvnorm()
set.seed(1)
# correlation matrix
sigma <- matrix(c(1,0.5,0.5,0.5,0.5,1,0.5,0.5,0.5,0.5,1,0.5,0.5,0.5,0.5,1), 4, 4)
# generate an n*m matrix from a multivariate normal
n <- 500
m <- 4
matc <- rmvnorm(n, mean = rep(0, m), sigma = sigma)
# transform the numerical values into ordinal categories (Likert scale),
# obtaining matrix mato
mato <- transfmatcat(matc, c(2,3,4,5))
# set the number of desired missing values
nummissing <- 150
# random missing values
matr <- missingmat(mato, nummissing, pattern = "r")
matr
# random blocks of missing values on variables 2 and 3
matb <- missingmat(mato, nummissing, pattern = "b", nk = c(2,3))
matb
# missing values on lowest category of variable 4
matl <- missingmat(mato, nummissing, pattern = "l", nk = 4, p = 0.1)
matl
# not at random missing values on variable 4
matn <- missingmat(mato, nummissing, pattern = "n", nk = 4, w = 4)
matn

ForImp documentation built on May 2, 2019, 8:17 a.m.