weightedEM: Estimates location and scatter on incomplete data with case...

View source: R/cellwiseWeights.R

weightedEM    R Documentation

Estimates location and scatter on incomplete data with case weights

Description

Carries out a rowwise weighted EM algorithm to estimate mu and Sigma of incomplete Gaussian data.
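To make the description concrete, here is a minimal base-R sketch of a textbook rowwise weighted EM for Gaussian data with missing cells. It is not the cellWise implementation: the function name weighted_em_sketch, its tol argument, the starting values, and the stopping rule are all assumptions made for illustration, and it omits the standardization, lmin bound, and log-likelihood tracking described on this page.

```r
# Illustrative sketch only (hypothetical helper, not cellWise code).
weighted_em_sketch <- function(X, w = rep(1, nrow(X)), tol = 1e-8, maxiter = 500) {
  X <- as.matrix(X)
  n <- nrow(X); d <- ncol(X)
  mu <- colMeans(X, na.rm = TRUE)                    # crude starting values
  Sigma <- diag(apply(X, 2, var, na.rm = TRUE), d)
  for (iter in seq_len(maxiter)) {
    Ximp <- X
    C <- matrix(0, d, d)       # weighted sum of conditional covariances
    # E-step: conditional mean and covariance of the missing cells per row
    for (i in seq_len(n)) {
      m <- is.na(X[i, ])
      if (!any(m)) next
      o <- !m
      if (any(o)) {
        B <- Sigma[m, o, drop = FALSE] %*% solve(Sigma[o, o, drop = FALSE])
        Ximp[i, m] <- mu[m] + B %*% (X[i, o] - mu[o])
        condcov <- Sigma[m, m, drop = FALSE] - B %*% Sigma[o, m, drop = FALSE]
      } else {                 # fully missing row: impute the current mean
        Ximp[i, ] <- mu
        condcov <- Sigma
      }
      C[m, m] <- C[m, m] + w[i] * condcov
    }
    # M-step: weighted mean and weighted covariance of the completed data
    mu_new <- colSums(w * Ximp) / sum(w)
    R <- sweep(Ximp, 2, mu_new)
    Sigma_new <- (crossprod(sqrt(w) * R) + C) / sum(w)
    delta <- max(abs(mu_new - mu), abs(Sigma_new - Sigma))
    mu <- mu_new; Sigma <- Sigma_new
    if (delta < tol) break
  }
  list(mu = mu, Sigma = Sigma, impX = Ximp, niter = iter)
}

set.seed(1)
S <- matrix(0.7, 3, 3); diag(S) <- 1
X <- matrix(rnorm(600), 200, 3) %*% chol(S)   # correlated Gaussian data
X[1, 3] <- X[2, 2] <- X[3, 1] <- NA           # a few missing cells
w <- runif(200)                               # rowwise weights
fit <- weighted_em_sketch(X, w)
fit$mu
round(fit$Sigma, 3)
```

On complete data this sketch reduces to the weighted mean and the weighted maximum-likelihood covariance, i.e. the result of stats::cov.wt with method = "ML".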

Usage

weightedEM(X, w = NULL, lmin = NULL, crit = 1e-4,
           maxiter = 1000, initEst = NULL, computeloglik = FALSE)

Arguments

X

n by d data matrix or data frame.

w

vector with n nonnegative rowwise (casewise) weights. If NULL, all weights are set to 1 so an unweighted EM is carried out.

lmin

if not NULL, a lower bound on the eigenvalues of the estimated EM covariance matrix on the standardized data, to avoid singularity.

crit

convergence tolerance on the change between successive estimates of mu and Sigma.

maxiter

maximal number of iteration steps.

initEst

if not NULL, a list with initial estimates $mu of the mean, $Sigma of the covariance matrix.

computeloglik

if TRUE, the log(likelihood) is computed in every step and reported. Default is FALSE to save computation time.
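As an illustration of what a lower bound such as lmin on the eigenvalues can mean, the sketch below clips the eigenvalues of a symmetric matrix from below and rebuilds the matrix. How weightedEM enforces the bound internally is not stated here, so the clipping approach and the helper name floor_eigenvalues are assumptions. The input matrix is taken to be on the standardized scale already.

```r
# Hypothetical helper: raise all eigenvalues below lmin up to lmin.
floor_eigenvalues <- function(S, lmin) {
  e <- eigen(S, symmetric = TRUE)
  vals <- pmax(e$values, lmin)               # clip small eigenvalues
  e$vectors %*% diag(vals, nrow = length(vals)) %*% t(e$vectors)
}

S <- matrix(c(1, 0.999, 0.999, 1), 2, 2)     # nearly singular: eigenvalues 1.999 and 0.001
S_reg <- floor_eigenvalues(S, lmin = 0.01)
eigen(S_reg, symmetric = TRUE)$values        # smallest eigenvalue is now about lmin
```

Clipping keeps the eigenvectors unchanged, so only the directions with almost no variance are inflated.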

Value

A list with components:

  • mu
    the estimated location vector.

  • Sigma
    the estimated covariance matrix.

  • impX
    the imputed data matrix.

  • niter
    the number of iteration steps taken.

  • loglikhd
    vector with the total log(likelihood) at every iteration step. When computeloglik = FALSE, this vector contains NAs.
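For orientation, a weighted observed-data Gaussian log-likelihood can be sketched in base R as below: each row contributes the log-density of its observed cells, multiplied by its weight. Whether loglikhd uses exactly this scaling and these constants is an assumption, and weighted_obs_loglik is a hypothetical helper, not a cellWise function.

```r
# Hypothetical helper: sum of w[i] * log N(x_obs; mu_obs, Sigma_obs,obs) over rows.
weighted_obs_loglik <- function(X, w, mu, Sigma) {
  X <- as.matrix(X)
  total <- 0
  for (i in seq_len(nrow(X))) {
    o <- !is.na(X[i, ])
    if (!any(o)) next                        # a fully missing row contributes nothing
    r <- X[i, o] - mu[o]
    S <- Sigma[o, o, drop = FALSE]
    logdet <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
    ll <- -0.5 * (sum(o) * log(2 * pi) + logdet + sum(r * solve(S, r)))
    total <- total + w[i] * ll
  }
  total
}

# Univariate check against dnorm()
x1 <- matrix(c(0.5, -1, 2), 3, 1)
w1 <- c(1, 2, 0.5)
ll_sketch <- weighted_obs_loglik(x1, w1, mu = 0.2, Sigma = matrix(1.5))
ll_dnorm  <- sum(w1 * dnorm(x1[, 1], mean = 0.2, sd = sqrt(1.5), log = TRUE))
all.equal(ll_sketch, ll_dnorm)
```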

Author(s)

P.J. Rousseeuw

References

P.J. Rousseeuw (2023). Analyzing cellwise weighted data. Econometrics and Statistics, appeared online. doi:10.1016/j.ecosta.2023.01.007 (open access)

See Also

unpack, cwLocScat

Examples


Sigma <- matrix(0.7, 3, 3); diag(Sigma) <- 1
set.seed(12345); X <- MASS::mvrnorm(1000, rep(0, 3), Sigma)
X[1, 3] <- X[2, 2] <- X[3, 1] <- X[4, 1] <- X[5, 2] <- NA
w <- runif(1000, 0, 1) # rowwise weights
out <- weightedEM(X, w, crit = 1e-12, computeloglik = TRUE)
out$niter # number of iteration steps taken
plot(1:out$niter, out$loglikhd[1:out$niter], type = 'l',
     lty = 1, col = 4, xlab = 'step', ylab = 'log(likelihood)',
     main = 'log(likelihood) of weighted EM iterations')
out$mu # estimated center
round(out$Sigma, 6) # estimated covariance matrix
head(X) # the data has NA's
head(out$impX) # imputed data, has no NA's

# For more examples, we refer to the vignette:
## Not run: 
vignette("cellwise_weights_examples")

## End(Not run)
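The initEst argument is not exercised in the examples above. The following sketch builds one possible starting point from columnwise means and the complete-case covariance; whether these are good starting values for a given dataset is an assumption. The call to weightedEM is guarded so that the snippet also runs when cellWise is not installed.

```r
set.seed(12345)
X <- matrix(rnorm(300), 100, 3)
X[1, 3] <- X[2, 2] <- X[3, 1] <- NA

# A list with components $mu and $Sigma, as described under Arguments
initEst <- list(mu    = colMeans(X, na.rm = TRUE),
                Sigma = cov(X, use = "complete.obs"))

if (requireNamespace("cellWise", quietly = TRUE)) {
  out <- cellWise::weightedEM(X, initEst = initEst)
  out$niter # number of iteration steps taken
}
```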

cellWise documentation built on Oct. 25, 2023, 5:07 p.m.