emmixwire: The EMMIX model with random effects

Description Usage Arguments Details Value References See Also Examples

Description

The function emmixwire fits the data with the specified EMMIX-WIRE model as in [1].

Usage

1
2
3
4
emmixwire(dat, g = "BIC", maxg = 10, ncov = 3, nvcov = 0, n1 = 0,
  n2 = 0, n3 = 0, X = NULL, W = NULL, U = NULL, V = NULL,
  cluster = NULL, init = NULL, debug = FALSE, itmax = 1000,
  epsilon = 1e-05, nkmeans = 5, nrandom = 0)

Arguments

dat

The dataset, an n by m numeric matrix, where n is number of observations and m the dimension of data.

g

The number of components in the mixture model, or 'BIC' for automatic selection using minimum BIC. It is also possible to give a list of candidate g values.

maxg

The maxium number of components in the mixture model if using 'BIC' for automatic selection.

ncov

A small integer indicating the type of covariance structure of item b .

nvcov

0 or 1, indicating whether or not to include random item c in the model.

n1

The number of samples in class 1.

n2

The number of samples in class 2.

n3

The number of samples in class 3.

X

The design matrix X

W

The design matrix W

U

The design matrix U

V

The design matrix V

cluster

A vector of integers specifying the initial partitions of the data; the default is NULL.

init

A list containing the initial parameters for the mixture model. See details. The default value is NULL.

debug

A logical value, if it is TRUE, theoutput will be printed out; FALSE silent; the default value is FALSE.

itmax

A big integer specifying the maximum number of iterations to apply; the default value is 1000.

epsilon

A small number used to stop the EM algorithm loop when the relative difference between log-likelihood at each iteration become sufficient small; the default value is 1e-5.

nkmeans

An integer to specify the number of KMEANS partitions to be used to find the best initial values.

nrandom

An integer to specify the number of random partitions to be used to find the best initial values.

Details

The combination of ncov and nvcov defines the covariance structure of the random effect item b and c respectively.#'For example, when both ncov and nvcov are zeros, #'it is a mixtue of normal regression model. Specifically, when ncov=0, random effect b is ignored; when ncov=1 or 2, all components share a common covariance of random item b; when ncov=2, the common covariance matrix is diagonal; when ncov=3 or 4, each component has their own covariance matrix of item b; when ncov=4, the covariance matrices of item b are diagonal; when ncov=5, the covariance matrices of item b are sigma(h)*I(m)(diagonal scale parameter matrix with same identical diagonal element values).

Value

A list containing the following:

error

Error code, 0= normal exit; 1= did not converge within itmax iterations

loglik

The log likelihood at convergence

np

The total number of parameters

AIC

Akaike Information Criterion (AIC)

BIC

Bayes Information Criterion (BIC)

pro

A vector of mixing proportions

beta

A numeric matrix with each column corresponding to the location parameter.

sigma.e

The covariance parameters of error item e

sigma.b

The covariance parameters of random item b

sigma.c

The covariance parameters of random item c

cluster

A vector of final partition

eb

The conditional expectation of random item b

ec

The conditional expectation of random item c

lk

A vector of log likelihood at each EM iteration

tau

An n by g matrix of posterior probability for each data point

X

The design matrix X

W

The design matrix W

U

The design matrix U

V

The design matrix V

g

The number of components in the mixture model

References

[1] Ng, S. K., McLachlan, G. J., Wang, K., Nagymanyoki, Z., Liu, S., & Ng, S. W. (2014). Inference on differences between classes using cluster-specific contrasts of mixed effects. Biostatistics, 16(1), 98-112.

See Also

scores.wire

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
data(expr.norm)
data(mapping.unique)

dat=  expr.norm
map=  mapping.unique
#map= 1: S/C=1,  2535;
#map=-1: "empties",  10131;
#map=-2: "mixed",  ambiguous,  13,  excluded;
#map> 1: 1331 nonnulls,   S/C > 1.

# in summary,  
#1331 (9.5 percent) nonnulls;
#and 
#12, 666 (90.5 percent) true nulls.
#---------------------
dat=  as.matrix(dat[map>=-1, ])
map=  map[map>=-1]
#---------------------

y  <- log(dat)
set.seed(123)
ret <- emmixwire(y, g=3, ncov=3, nvcov=1, n1=3, n2=3, n3=0, 
        debug=0, itmax=20, epsilon=1e-5, nkmeans=5)

### alternatively,  
#X <- U <- cbind(c(1, 1, 1, 0, 0, 0), c(0, 0, 0, 1, 1, 1))
#m<-6   # m is number of columns
#V<-diag(m)
#W <-rep(1, m)
#ret <- emmixwire(y, g=3, ncov=3, nvcov=1, X=X, W=W, U=U, V=V, 
#    debug=0, itmax=20, epsilon=1e-5, nkmeans=5)

###calculate the weighted contrast W_j
wj <- scores.wire(ret)
names(wj) <- names(map)
###top 1000 genes
wire.1000 <- names(map)[order(abs(wj), decreasing=TRUE)][seq_len(1000)]
###the number of false non-nulls in the top 1000 genes 
sum(map[wire.1000] == 1) + sum( map[wire.1000] == -1)
#119

##alternatively
### the null distribution of W_j
wj0 <- wj2.permuted(y, ret, nB=19)
pv  <- pvalue.wire(wj, wj0)
wire.1000 <- names(map)[order(pv, decreasing=0)][seq_len(1000)]
###the number of false non-nulls in the top 1000 genes 
sum(map[wire.1000] == 1) + sum( map[wire.1000] == -1)
#119
hist(pv, 50)

andrewthomasjones/EMMIXcontrasts documentation built on June 26, 2019, 5:46 a.m.