fdiscd.predict: Predicting the class of a group of individuals with...




Allocates several groups of individuals, one group after another, to one of K classes of groups, using the L^2 distances between the density function associated with the group to allocate and the density functions associated with the K classes.
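As a rough illustration of the L^2 distance mentioned above, the sketch below compares two univariate Gaussian densities. It is not the package's code: the helper name `l2_gauss` and the parameter values are made up for illustration, and it relies on the standard closed form for the integral of a product of two Gaussian densities. A numerical integration is included as a cross-check.

```r
# Squared L^2 distance between f = N(m1, s1^2) and g = N(m2, s2^2):
#   ||f - g||^2 = int f^2 + int g^2 - 2 * int f*g,
# with int N(m1, s1^2) * N(m2, s2^2) = dnorm(m1 - m2, 0, sqrt(s1^2 + s2^2)).
# l2_gauss is a hypothetical helper, not a function of the dad package.
l2_gauss <- function(m1, s1, m2, s2) {
  ff <- dnorm(0, 0, sqrt(2) * s1)
  gg <- dnorm(0, 0, sqrt(2) * s2)
  fg <- dnorm(m1 - m2, 0, sqrt(s1^2 + s2^2))
  # max() guards against tiny negative values caused by rounding
  sqrt(max(ff + gg - 2 * fg, 0))
}

d_closed <- l2_gauss(0, 1, 1.5, 2)

# Cross-check: numerical integration of (f - g)^2 over the real line
sq_diff <- function(u) (dnorm(u, 0, 1) - dnorm(u, 1.5, 2))^2
d_numeric <- sqrt(integrate(sq_diff, -Inf, Inf)$value)
```

The two values should agree up to the tolerance of the numerical integration.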


fdiscd.predict(xf, class.var, crit = 1, gaussiand = TRUE, kern = NULL, windowh = NULL, 
               misclass.ratio = FALSE)



xf: object of class folderh with two data frames:

  • The first one has at least two columns. One column contains the names of the T groups (all the names must be different). Another column is a factor with K levels partitioning the T groups into K classes.

  • The second one has (p+1) columns. The first p columns are numeric (otherwise, there is an error). The last column is a factor with T levels defining T groups. Each group, say t, consists of n_t individuals.

Note that in versions earlier than 2.0, fdiscd.predict was applied to two data frames.


class.var: string. The name of the class variable.


crit: 1, 2 or 3. Selects the method used to estimate the densities associated with the classes. See Details.


gaussiand: logical. If TRUE (default), the probability densities are assumed to be Gaussian. If FALSE, densities are estimated using the Gaussian kernel method.


kern: string. If gaussiand = FALSE, this argument sets the kernel used in the estimation method. Currently, only the Gaussian kernel is available: the settings kern = "gauss" and kern = NULL are equivalent.


windowh: strictly positive number. If windowh = NULL (default), the bandwidths are computed using the bandwidth.parameter function.


misclass.ratio: logical (default FALSE). If TRUE, the confusion matrix and misclassification ratio are computed on the groups whose prior class is known. To compute the misclassification ratio by the leave-one-out method, use the fdiscd.misclass function.
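
When gaussiand = FALSE, the densities are estimated by the Gaussian kernel method. A minimal univariate sketch of such an estimator is given below. The function name `kde_gauss`, the simulated sample and the bandwidth h = 0.4 are hypothetical, and the bandwidth is fixed by hand here rather than obtained from bandwidth.parameter. The package itself works with multivariate data, so this is only meant to show the idea.

```r
# Gaussian kernel density estimate: f_hat(u) = (1/n) * sum_i dnorm(u, x_i, h).
# kde_gauss is a hypothetical helper, not part of the dad package.
kde_gauss <- function(x, h) {
  stopifnot(is.numeric(x), is.numeric(h), h > 0)
  function(u) {
    vapply(u, function(ui) mean(dnorm(ui, mean = x, sd = h)), numeric(1))
  }
}

set.seed(42)
x <- rnorm(50, mean = 3, sd = 1)   # simulated sample for one group
f_hat <- kde_gauss(x, h = 0.4)     # h chosen arbitrarily for the sketch

f_hat(c(2, 3, 4))                  # estimated density at three points

# The estimate should integrate to about 1 over a range covering the data
total_mass <- integrate(f_hat, min(x) - 10 * 0.4, max(x) + 10 * 0.4)$value
```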


Each group t is associated with a density denoted f_t. Each class k, consisting of T_k groups, is associated with a density denoted g_k. The crit argument selects the method used to estimate the K densities g_k. Below, x denotes the second data frame of xf, and the sums run over the T_k groups of class k.

  1. The density g_k is estimated using all the data of the class, that is, the rows of x corresponding to the T_k groups of class k.

  2. The T_k densities f_t are estimated from the corresponding rows of x, then averaged to obtain an estimate of g_k, that is, g_k = (1/T_k) ∑ f_t.

  3. Each density f_t is weighted by n_t (the number of rows of x corresponding to f_t) before averaging, that is, g_k = (1/∑ n_t) ∑ n_t f_t.
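
The three options can be sketched in a few lines of base R for a single class made of two univariate groups, under the Gaussian assumption. This is an illustration only: the simulated data, the helper `gauss_fit` and the object names `g1`, `g2`, `g3` are invented here, and the package itself handles multivariate data.

```r
set.seed(1)
# Two groups belonging to the same class, with different sizes n_t
# (simulated data, for illustration only).
groups <- list(a = rnorm(20, mean = 0), b = rnorm(60, mean = 2))
n <- lengths(groups)

# Gaussian density with plugged-in sample mean and standard deviation
gauss_fit <- function(x) {
  m <- mean(x)
  s <- sd(x)
  function(u) dnorm(u, m, s)
}
f <- lapply(groups, gauss_fit)   # the group densities f_t

# crit = 1: a single density fitted to the pooled data of the class
g1 <- gauss_fit(unlist(groups))
# crit = 2: unweighted average of the f_t
g2 <- function(u) (f$a(u) + f$b(u)) / length(f)
# crit = 3: average of the f_t weighted by the group sizes n_t
g3 <- function(u) (n[["a"]] * f$a(u) + n[["b"]] * f$b(u)) / sum(n)

u <- seq(-3, 5, by = 0.5)
round(cbind(u, g1 = g1(u), g2 = g2(u), g3 = g3(u)), 3)
```

With unequal group sizes, as here, options 2 and 3 give different estimates; they coincide when all n_t are equal.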


Returns an object of class fdiscd.predict, that is a list including:


data frame with 3 columns:

  • factor giving the group name. The column name is the same as that of column (p+1) of the second data frame of xf,

  • class.known: the prior class of the group if it is available, or NA if not,

  • class.predict: the class allocation predicted by the discriminant analysis method. If misclass.ratio = TRUE, the class allocations are computed for all groups. Otherwise (default), they are computed only for the groups whose class is unknown.


matrix with T rows and K columns, of the distances (d_{tk}): d_{tk} is the distance between the group t and the class k,


matrix of the proximities (as percentages). The proximity of a group t to the class k is computed as: (1/d_{tk})/∑_{l=1}^{K}(1/d_{tl}).


the confusion matrix (if misclass.ratio = TRUE)


the misclassification ratio (if misclass.ratio = TRUE)
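
To make the relationship between the distances, the proximities and the two misclassification outputs concrete, here is a small base R sketch. The distance values, group names and known classes are made up, and allocating each group to its nearest class is an assumption about the allocation rule, not something taken from the package source. The misclassification ratio is likewise assumed to be the proportion of wrongly allocated groups among those with a known class.

```r
# Hypothetical distances d_tk for T = 4 groups and K = 2 classes
d <- matrix(c(0.10, 0.40,
              0.30, 0.15,
              0.25, 0.05,
              0.20, 0.60),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("group", 1:4), c("A", "B")))

# Proximities in percent: (1/d_tk) / sum_l (1/d_tl), computed row by row
prox <- 100 * (1 / d) / rowSums(1 / d)

# Assumed rule: each group goes to the class with the smallest distance
class.predict <- factor(colnames(d)[apply(d, 1, which.min)],
                        levels = colnames(d))
# Made-up prior classes; the last group has no known class
class.known <- factor(c("A", "B", "A", NA), levels = colnames(d))

# Confusion matrix and misclassification ratio on groups with a known class
known <- !is.na(class.known)
confusion <- table(known = class.known[known],
                   predicted = class.predict[known])
misclass <- mean(class.known[known] != class.predict[known])
```

Each row of `prox` sums to 100, so a proximity can be read as the relative closeness of a group to each class.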


Rachid Boumaza, Pierre Santagostini, Smail Yousfi, Gilles Hunault, Sabine Demotes-Mainard


Boumaza, R. (2004). Discriminant analysis with independently repeated multivariate measurements: an L^2 approach. Computational Statistics & Data Analysis, 47, 823-843.

Rudrauf, J.M., Boumaza, R. (2001). Contribution à l'étude de l'architecture médiévale: les caractéristiques des pierres à bossage des châteaux forts alsaciens. Centre de Recherches Archéologiques Médiévales de Saverne, 5, 5-38.


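library(dad)
data(castles.dated)
data(castles.nondated)
# Stack the dated and non-dated castles, then build the folderh object
castles.stones <- rbind(castles.dated$stones, castles.nondated$stones)
castles.periods <- rbind(castles.dated$periods, castles.nondated$periods)
castlesfh <- folderh(castles.periods, "castle", castles.stones)
result <- fdiscd.predict(castlesfh, "period")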

dad documentation built on Jan. 4, 2018, 5:12 p.m.