fitClere: fitClere function

Description Usage Arguments Value See Also Examples

View source: R/fitClere.R

Description

This function runs the CLERE Model. It returns an object of class Clere. For more details please refer to clere.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
fitClere(
  y,
  x,
  g = 1,
  nItMC = 50,
  nItEM = 1000,
  nBurn = 200,
  dp = 5,
  nsamp = 200,
  maxit = 500,
  tol = 0.001,
  nstart = 2,
  parallel = FALSE,
  seed = NULL,
  plotit = FALSE,
  sparse = FALSE,
  analysis = "fit",
  algorithm = "SEM",
  theta0 = NULL,
  Z0 = NULL
)

Arguments

y

[numeric]: The vector of observed responses - size n.

x

[matrix]: The matrix of predictors - size n rows and p columns.

g

[integer]: Either the number or the maximum of groups for fitting CLERE. Maximum number of groups is considered when model selection is required.

nItMC

[numeric]: Number of Gibbs iterations to generate the partitions. After the nBurn iterations, this number is automatically set to 1.

nItEM

[numeric]: Number of SEM iterations.

nBurn

[numeric]: Number of SEM iterations discarded before calculating the MLE which is averaged over SEM draws.

dp

[numeric]: Number of iterations between sampled partitions when calculating the likelihood at the end of the run.

nsamp

[numeric]: Number of sampled partitions for calculating the likelihood at the end of the run.

maxit

[numeric]: An EM algorithm is used inside the SEM to maximize the complete log-likelihood p(y, Z|theta). maxit stands as the maximum number of EM iterations for the internal EM.

tol

[numeric]: Maximum increased in complete log-likelihood for the internal EM (stopping criterion).

nstart

[integer]: Number of random starting points to be used for fitting the model.

parallel

[logical]: Should the estimation from nstart random starting points run in parallel?

seed

[integer]: An integer given as a seed for random number generation. If set to NULL, then a random seed is generated between 1 and 1000.

plotit

[logical]: Should a summary plot (base plot) be drawn after the run?

sparse

[logical]: Should a 0 class be imposed to the model?

analysis

[character]: Which analysis is to be performed. Values are "fit", "bic", "aic" and "icl".

algorithm

[character]: The algorithm to be chosen to fit the model. Either the SEM-Gibbs algorithm or the MCEM algorithm. The most efficient algorithm being the SEM-Gibbs approach. MCEM is not available for binary response.

theta0

[vector(numeric)]: An initial guess of the model parameters. When considering g components, the length of theta0 must be 2*g+3 and theta0 should be filled as intercept, the b_k's (g real numbers), the pi_k's (g real numbers summing to 1), sigma^2 and gamma^2 (two positive numbers).

Z0

[vector(integer)]: A vector of integers representing an initial partition for the variables. For 10 variables and 3 groups Z0 can be defined as \ Z0 = c(rep(0, 2), rep(1, 3), rep(2, 5)).

Value

Object of class Clere.

See Also

Overview : clere-package
Classes : Clere, Pacs
Methods : plot, clusters, predict, summary
Functions : fitClere, fitPacs Datasets : numExpRealData, numExpSimData, algoComp

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
library(clere)
plotit    <- FALSE
sparse    <- FALSE
nItEM     <- 100
nBurn     <- nItEM / 2
nsamp     <- 100
analysis  <- "fit"
algorithm <- "SEM"
nItMC     <- 1
dp        <- 2
maxit     <- 200
tol       <- 1e-3

n         <- 50
p         <- 50
intercept <- 0
sigma     <- 10
gamma     <- 10
rho       <- 0.5

g         <- 5 
probs     <- c(0.36, 0.28, 0.20, 0.12, 0.04)
Eff       <- p * probs
a         <- 5
B         <- a**(0:(g-1))-1
Z         <- matrix(0, nrow = p, ncol = g)
imax      <- 0
imin      <- 1

for (k in 1:g) {
    imin <- imax+1
    imax <- imax+Eff[k]
    Z[imin:imax, k] <- 1
}
Z <- Z[sample(1:p, p), ]
if (g>1) {
    Beta <- rnorm(p, mean = c(Z%*%B), sd = gamma)
} else {
    Beta <- rnorm(p, mean = B, sd = gamma)
}

theta0 <- NULL # c(intercept, B, probs, sigma^2, gamma^2)
Z0     <- NULL # apply(Z, 1, which.max)-1

gmax <- 7

## Prediction
eps  <- rnorm(n, mean = 0, sd = sigma)
X    <- matrix(rnorm(n*p), nrow = n, ncol = p)
Y    <- as.numeric(intercept+X%*%Beta+eps)
tt   <- system.time(mod <- fitClere(y = Y, x = X, g = gmax, 
                        analysis = analysis,algorithm = algorithm,
                        plotit = plotit, 
                        sparse = FALSE,nItEM = nItEM, 
                        nBurn = nBurn, nItMC = nItMC, 
                        nsamp = nsamp, theta0 = theta0, Z0 = Z0) )
plot(mod)
Yv <- predict(object = mod, newx = X)

clere documentation built on Feb. 7, 2020, 1:06 a.m.