archetypoids | R Documentation |
Archetypoid algorithm. It is based on the PAM clustering algorithm and consists of two phases: a BUILD phase and a SWAP phase. In the BUILD phase, an initial set of archetypoids is determined. Unlike in PAM, this set is not built stepwise. Instead, the suggested choice is the set of individuals nearest to the archetypes returned by the archetypes function of the archetypes R package (Eugster and Leisch, 2009). This set can be defined in three different ways; see the arguments below. The SWAP phase has the same goal as the SWAP phase of PAM, but with a different objective function: it tries to improve the initial vector of archetypoids by exchanging selected individuals for unselected ones and checking whether each replacement reduces the objective function of the archetypoid analysis problem.
All details are given in Vinue et al. (2015).
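The SWAP idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: rss_for and swap_phase are hypothetical helper names, optim with method "L-BFGS-B" stands in for whatever solver the package uses for the convex least squares problems, and the sum-to-one constraint on the weights is imposed only softly through an extra row scaled by huge.

```r
# Hypothetical helper: RSS when every row of X is approximated by a
# convex combination of the rows X[idx, ].
# Weights are non-negative (lower = 0) and sum to roughly one thanks to
# the extra row scaled by `huge`.
rss_for <- function(X, idx, huge = 200) {
  Z <- X[idx, , drop = FALSE]
  k <- length(idx)
  A <- rbind(t(Z), rep(huge, k))          # augmented design matrix
  total <- 0
  for (i in seq_len(nrow(X))) {
    b <- c(X[i, ], huge)                  # augmented response
    fn <- function(a) sum((A %*% a - b)^2)
    gr <- function(a) as.vector(2 * t(A) %*% (A %*% a - b))
    a <- optim(rep(1 / k, k), fn, gr, method = "L-BFGS-B", lower = 0)$par
    total <- total + sum((X[i, ] - as.vector(t(Z) %*% a))^2)
  }
  total
}

# Hypothetical helper: exchange one selected row for one unselected row
# for as long as some exchange lowers the objective.
swap_phase <- function(X, init, objective = rss_for) {
  current <- init
  best <- objective(X, current)
  improved <- TRUE
  while (improved) {
    improved <- FALSE
    for (pos in seq_along(current)) {
      for (cand in setdiff(seq_len(nrow(X)), current)) {
        trial <- current
        trial[pos] <- cand
        value <- objective(X, trial)
        if (value < best - 1e-8) {        # accept only real improvements
          current <- trial
          best <- value
          improved <- TRUE
        }
      }
    }
  }
  list(cases = current, rss = best)
}

# Toy run on simulated data, starting from rows 1 and 2.
set.seed(1)
X <- matrix(rnorm(16), nrow = 8)
res <- swap_phase(X, init = c(1, 2))
res$cases
res$rss
```

The returned rss can never exceed the value for the initial vector, because a swap is accepted only when it lowers the objective. Like PAM, the loop stops at a local optimum, which need not be the global one.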
archetypoids(numArchoid, data, huge = 200, step, init, ArchObj, nearest = "cand_ns", sequ, aux)
numArchoid |
Number of archetypoids (archetypal observations). |
data |
Data matrix. Each row corresponds to an observation and each column corresponds to an anthropometric variable. All variables are numeric. |
huge |
Penalization added to the convex least squares problems that are solved to estimate the archetypoids; see Eugster and Leisch (2009). Default value is 200. |
step |
Logical value. If TRUE, the archetypoid algorithm is executed repeatedly within |
init |
Initial vector of archetypoids for the BUILD phase of the archetypoid algorithm. It is computed within |
ArchObj |
The list object returned by the |
nearest |
Initial vector of archetypoids for the BUILD phase of the archetypoid algorithm. Required when |
sequ |
Logical value. It indicates whether a sequence of archetypoids (TRUE) or only a single number of them (FALSE) is computed. It is determined by the number of archetypes computed by means of |
aux |
If |
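The role of the huge argument can be illustrated with a small base R sketch. It assumes that the penalization works by appending a row of constant value huge to a non-negative least squares problem, so that the weights are pushed towards summing to one. convex_weights is a hypothetical helper name and optim is used here only as a stand-in solver; none of this is taken from the package source.

```r
# Hypothetical helper: weights that approximate observation x as a
# convex combination of the rows of Z.
convex_weights <- function(x, Z, huge = 200) {
  k <- nrow(Z)
  A <- rbind(t(Z), rep(huge, k))   # extra row penalizes sum(a) != 1
  b <- c(x, huge)
  fn <- function(a) sum((A %*% a - b)^2)
  gr <- function(a) as.vector(2 * t(A) %*% (A %*% a - b))
  # lower = 0 keeps every weight non-negative
  optim(rep(1 / k, k), fn, gr, method = "L-BFGS-B", lower = 0)$par
}

# A point inside the triangle with vertices (0,0), (1,0), (0,1).
Z <- rbind(c(0, 0), c(1, 0), c(0, 1))
w <- convex_weights(c(0.2, 0.3), Z)
round(w, 3)
sum(w)
```

For this point the exact convex weights are 0.5, 0.2 and 0.3, so the numerical result should be close to those values and sum to approximately one. A smaller huge loosens the sum-to-one constraint for points outside the convex hull.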
As mentioned, this algorithm is based on PAM. Algorithms of this type aim to find good solutions in a short time, although not necessarily the best solution. The global minimum could always be found by spending as much time as necessary, but this would be computationally very inefficient.
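A quick count shows the gap between the two strategies. The figures below use the size of the toy example in this page (25 individuals, 3 archetypoids) and assume that one pass of a PAM-style search examines every one-for-one exchange between a selected and an unselected individual.

```r
n <- 25  # individuals in the toy example
k <- 3   # archetypoids requested

exhaustive_sets <- choose(n, k)     # every possible set of k individuals
exchanges_per_pass <- k * (n - k)   # one-for-one exchanges in a single pass

c(exhaustive = exhaustive_sets, per_pass = exchanges_per_pass)
```

The number of candidate sets grows combinatorially with n and k, whereas the number of exchanges per pass grows only with their product.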
A list with the following elements:
cases: Anthropometric cases (final vector of numArchoid archetypoids).
rss: Residual sum of squares corresponding to the final vector of numArchoid archetypoids.
archet_ini: Vector of initial archetypoids (cand_ns, cand_alpha or cand_beta).
alphas: Alpha coefficients for the optimal vector of archetypoids.
It may happen that the archetypes function does not find results for numArchoid archetypes. In this case, the vector of nearest individuals cannot be calculated and, consequently, neither can the vector of archetypoids. The function will therefore return an error message.
Irene Epifanio and Guillermo Vinue
Vinue, G., Epifanio, I., and Alemany, S., (2015). Archetypoids: a new approach to define representative archetypal data, Computational Statistics and Data Analysis 87, 102–115.
Cutler, A., and Breiman, L., (1994). Archetypal Analysis, Technometrics 36, 338–347.
Epifanio, I., Vinue, G., and Alemany, S., (2013). Archetypal analysis: contributions for estimating boundary cases in multivariate accommodation problem, Computers & Industrial Engineering 64, 757–765.
Eugster, M. J., and Leisch, F., (2009). From Spider-Man to Hero - Archetypal Analysis in R, Journal of Statistical Software 30, 1–23, doi: 10.18637/jss.v030.i08.
Eugster, M. J. A., (2012). Performance profiles based on archetypal athletes, International Journal of Performance Analysis in Sport 12, 166–187.
stepArchetypesRawData, archetypes, stepArchetypoids
#Note: For a sportive example, see www.uv.es/vivigui/softw/more_examples.R

#COCKPIT DESIGN PROBLEM:
#As a toy example, only the first 25 individuals are used.
USAFSurvey_First25 <- USAFSurvey[1:25, ]
#Variable selection:
variabl_sel <- c(48, 40, 39, 33, 34, 36)
#Changing to inches:
USAFSurvey_First25_inch <- USAFSurvey_First25[,variabl_sel] / (10 * 2.54)

#Data preprocessing:
USAFSurvey_preproc <- preprocessing(USAFSurvey_First25_inch, TRUE, 0.95, TRUE)

#For reproducing results, seed for randomness:
#suppressWarnings(RNGversion("3.5.0"))
#set.seed(2010)
#Run archetype algorithm repeatedly from 1 to numArch archetypes:
#This is a toy example. In other situation, choose numArch=10 and numRep=20.
numArch <- 5 ; numRep <- 2
lass <- stepArchetypesRawData(data = USAFSurvey_preproc$data, numArch = 1:numArch, numRep = numRep, verbose = FALSE)
#To understand the warning messages, see the vignette of the
#archetypes package.
#screeplot(lass)

numArchoid <- 3 #number of archetypoids.
res_ns <- archetypoids(numArchoid, USAFSurvey_preproc$data, huge = 200, step = FALSE, ArchObj = lass, nearest = "cand_ns",sequ = TRUE)