| tclustregIC | R Documentation |
tclustreg for different numbers of groups k and restriction factors c
The last two letters stand for 'Information Criterion'. This function computes
the values of BIC (MIXMIX), ICL (MIXCLA) or CLA (CLACLA) for different values
of k (number of groups) and different values of c
(restriction factor for the variances of the residuals), for
a prespecified level of trimming. To minimize randomness, for a given k
the same subsets are used for each value of c.
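The idea of reusing the same subsets across all values of c, for a given k, can be sketched in base R as follows. This is only an illustration of the looping scheme, not the package's implementation: the sample size n, the number of regressors p, the subset size k * p and the small nsamp are all assumptions made for the sketch, and the criterion value is left as a placeholder.

```r
# Illustrative sketch only: NOT the implementation used by tclustregIC.
set.seed(1)
n     <- 100                                  # assumed number of observations
p     <- 2                                    # assumed number of regressors
kk    <- 1:5                                  # default values of k
cc    <- c(1, 2, 4, 8, 16, 32, 64, 128)       # default values of c
nsamp <- 3                                    # assumed (small) number of subsamples

# One cell per (k, c) pair: length(kk) rows, length(cc) columns
ic <- matrix(NA_real_, nrow = length(kk), ncol = length(cc),
             dimnames = list(paste0("k=", kk), paste0("c=", cc)))
subsets_used <- vector("list", length(kk))

for (i in seq_along(kk)) {
  k <- kk[i]
  # Draw the starting subsets once per k (assumed size: k * p units each) ...
  subsets <- replicate(nsamp, sample(n, k * p), simplify = FALSE)
  subsets_used[[i]] <- subsets
  for (j in seq_along(cc)) {
    # ... and reuse the same 'subsets' for every value of c.
    # A real fit would be run here; the cell is left as a placeholder.
    ic[i, j] <- NA_real_
  }
}

dim(ic)
```

Because the random draw sits outside the loop over c, differences between columns of the same row cannot be attributed to different starting subsets.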
tclustregIC(
y,
x,
alphaLik = 0,
alphaX = 1,
intercept = TRUE,
whichIC = c("ALL", "MIXMIX", "MIXCLA", "CLACLA"),
kk = 1:5,
cc = c(1, 2, 4, 8, 16, 32, 64, 128),
ccSigmaX = 12,
plot = FALSE,
nsamp,
refsteps = 10,
reftol = 1e-13,
equalweights = FALSE,
we,
msg = TRUE,
nocheck = FALSE,
RandNumbForNini,
startv1 = 1,
UnitsSameGroup,
commonslope = FALSE,
Ysave = TRUE,
trace = FALSE,
...
)
y |
Response variable. A vector with |
x |
An n x p data matrix (n observations and p variables). Rows of x represent observations, and columns represent variables. Missing values (NA's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations. |
alphaLik |
Trimming level, a number between 0 and 0.5 or an
integer number specifying the number of observations which have to be trimmed.
If |
alphaX |
Second-level trimming or constrained weighted model for
|
intercept |
Whether to use a constant term (default is |
whichIC |
A character value which specifies which information criteria must be computed
for each
|
kk |
An integer vector specifying the numbers of mixture components (clusters)
for which the information criteria are to be calculated. By default |
cc |
a vector specifying the values of the restriction factor which have to
be considered for the variances of the residuals of the regression lines.
By default |
ccSigmaX |
A number specifying the value of the restriction factor which has to be
considered for the covariance matrices of the explanatory variables. The default value
is |
plot |
If |
nsamp |
If a scalar, it contains the number of subsamples which will be extracted.
If |
refsteps |
Number of refining iterations in each subsample. Default is |
reftol |
Tolerance of the refining steps. The default value is 1e-13 |
equalweights |
A logical specifying whether cluster weights in the concentration
and assignment steps shall be considered. If |
we |
Weights. A vector of size n-by-1 containing application-specific weights. Default is a vector of ones. |
msg |
Controls whether messages are displayed on the screen. If |
nocheck |
Check input arguments. If |
RandNumbForNini |
pre-extracted random numbers to initialize proportions.
Matrix of size k-by-nrow(nsamp) containing the random numbers which
are used to initialize the proportions of the groups. This option is effective only if
|
startv1 |
How to initialize centroids and covariance matrices. Scalar.
If Remark 1: in order to start with a routine which is in the required parameter space, eigenvalue restrictions are immediately applied. Remark 2 - option |
UnitsSameGroup |
List of the units which must (whenever possible) have
a particular label. For example |
commonslope |
Whether to impose a constraint of common slope on the regression coefficients.
If |
Ysave |
Whether to save the input response variable in the output |
trace |
Whether to print intermediate results. Default is |
... |
potential further arguments passed to lower level functions. |
An S3 object of class tclustregic, which is basically a list with the following components:
call the matched call.
CLACLA A matrix of size 5-times-8 if kk and cc are not
specified; otherwise a matrix of size length(kk)-times-length(cc)
containing the value of the penalized classification likelihood.
This output is present only if whichIC="CLACLA" or whichIC="ALL".
IDXCLA An array of size 5-times-8 if kk and cc are not
specified; otherwise an array of size length(kk)-times-length(cc).
Each element of the array is a list with one element, which is a vector of length n containing the assignment
of each unit using the classification model. This output is present only
if whichIC="CLACLA" or whichIC="ALL".
MIXMIX A matrix of size 5-times-8 if kk and cc are not
specified; otherwise a matrix of size length(kk)-times-length(cc)
containing the value of the penalized mixture likelihood.
This output is present only if whichIC="MIXMIX" or whichIC="ALL".
MIXCLA A matrix of size 5-times-8 if kk and cc are not
specified; otherwise a matrix of size length(kk)-times-length(cc)
containing the value of the ICL.
This output is present only if whichIC="MIXCLA" or whichIC="ALL".
IDXMIX An array of size 5-times-8 if kk and cc are not
specified; otherwise an array of size length(kk)-times-length(cc).
Each element of the array is a list with one element, which is a vector of length n containing the assignment
of each unit using the mixture model. This output is present only
if whichIC="MIXMIX", whichIC="MIXCLA" or whichIC="ALL".
kk A vector containing the values of k (number of components) which have been considered.
This vector is identical to the argument kk (default is kk=1:5).
cc A vector containing the values of c (values of the restriction factor) which
have been considered for the variance of the residuals. This vector is identical
to the argument cc (default is cc=c(1, 2, 4, 8, 16, 32, 64, 128)).
ccSigmaX Values of the restriction factor which
have been considered for the covariance matrices of the explanatory variables.
This vector is identical to the argument ccSigmaX.
alpha The trimming level which has been used in the likelihood (it stores the value of the input alphaLik).
alphaX Second-level trimming or constrained weighted model for X.
X Original data matrix of explanatory variables. Present if Ysave=TRUE.
y Original vector containing the response. Present if Ysave=TRUE.
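As a rough illustration of how a length(kk)-times-length(cc) matrix such as MIXMIX might be inspected, the base R sketch below locates the cell with the smallest value and maps it back to the corresponding k and c. The matrix ic is filled with arbitrary placeholder numbers, not output from tclustregIC, and the sketch assumes that smaller values of the criterion are preferable; that assumption should be checked against the package documentation before relying on it.

```r
# Illustrative sketch with placeholder numbers: NOT real tclustregIC output.
kk <- 1:5
cc <- c(1, 2, 4, 8, 16, 32, 64, 128)

set.seed(42)
ic <- matrix(runif(length(kk) * length(cc)),
             nrow = length(kk), ncol = length(cc),
             dimnames = list(paste0("k=", kk), paste0("c=", cc)))
ic[2, 4] <- -1   # force a known minimum so the result is predictable

# Row/column position of the smallest entry
# (assumption: smaller values of the criterion are better)
best   <- which(ic == min(ic), arr.ind = TRUE)
best_k <- kk[best[1, 1]]   # rows index the values of k
best_c <- cc[best[1, 2]]   # columns index the values of c

c(k = best_k, c = best_c)
```

With a fitted object, the same indexing would be applied to the matrix stored in the relevant component (for example the MIXMIX element of the returned list), using the kk and cc components to translate positions into values.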
FSDA team, valentin.todorov@chello.at
Torti F., Perrotta D., Riani, M. and Cerioli A. (2019). Assessing Robust Methodologies for Clustering Linear Regression Data, Advances in Data Analysis and Classification, Vol. 13, pp 227-257.
## Not run:
## The X data have been introduced by Gordaliza, Garcia-Escudero & Mayo-Iscar (2013).
## The dataset presents two parallel components without contamination.
data(X)
y1 <- X[, ncol(X)]
X1 <- X[, -ncol(X), drop = FALSE]
(out <- tclustregIC(y1, X1, plot = TRUE))
tclustICplot(out, whichIC="MIXMIX")
## End(Not run)