
This function performs classification on a dataset with ordinal features and a label variable taking values in {1, 2, ..., kr}. It provides two classification models. The first model (selected with the argument kc=0) is a multivariate BOS model with the assumption that, conditional on the class of the observations, the features are independent. The second model is a parsimonious version of the first. Parsimony is introduced by grouping the features into clusters (as in co-clustering) and assuming that the features of a cluster share a common distribution.
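The two models can be sketched as follows for a single group of J features with the same number of levels. The notation is an assumption made for illustration and does not come from this page: $p_{\mathrm{BOS}}(\cdot\,;\mu,\pi)$ denotes the BOS distribution with position parameter $\mu$ and precision parameter $\pi$, and $w_j$ denotes the column cluster of feature $j$.

$$
\text{Model 1 (kc = 0):}\qquad p(x_i \mid y_i = g) \;=\; \prod_{j=1}^{J} p_{\mathrm{BOS}}\!\left(x_{ij};\, \mu_{gj},\, \pi_{gj}\right)
$$

$$
\text{Model 2 (kc > 0):}\qquad p(x_i \mid y_i = g,\, w) \;=\; \prod_{j=1}^{J} p_{\mathrm{BOS}}\!\left(x_{ij};\, \mu_{g w_j},\, \pi_{g w_j}\right), \qquad w_j \in \{1,\dots,\mathrm{kc}\}
$$

In the second form, all features assigned to the same column cluster share one pair of parameters per class, which is where the reduction in the number of parameters comes from.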

```
bosclassif(x, y, idx_list=c(1), kr, kc=0, init, nbSEM, nbSEMburn,
           nbindmini, m=0, percentRandomB=0)
```

`x`
Matrix of ordinal data of dimension N*Jtot. Features with the same number of levels must be placed side by side. Missing values should be coded as NA.

`y`
Vector of length N giving the class of each row of x. Classes must be labeled with the integers 1, 2, ..., kr.

`idx_list`
Vector of length D. This argument is useful when variables have different numbers of levels. Element d should indicate the column of x at which the variables with m[d] levels begin.

`kr`
Number of row classes.

`kc`
Vector of length D. The d^th element indicates the number of column clusters for the d^th group of variables. Set to 0 to choose a classical multivariate BOS model.

`m`
Vector of length D. The d^th element defines the number of levels of the ordinal data in the d^th group of variables.

`nbSEM`
Number of SEM-Gibbs iterations performed to estimate the parameters.

`nbSEMburn`
Number of SEM-Gibbs burn-in iterations for estimating the parameters. This value must be less than nbSEM.

`nbindmini`
Minimum number of cells belonging to a block.

`init`
String that indicates the kind of initialisation. Must be one of the following strings: "kmeans", "random" or "randomBurnin".

`percentRandomB`
Vector of length 1. Indicates the percentage of resampling when init is equal to "randomBurnin".
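As an illustration of how `x`, `idx_list` and `m` fit together when features have different numbers of levels, here is a minimal base-R sketch. The toy data and the helper names `x5`, `x3` and `J` are hypothetical and only serve the example; nothing here calls the package itself.

```r
# Toy data (hypothetical): 20 observations, two groups of ordinal features.
set.seed(1)
N <- 20
x5 <- matrix(sample(1:5, N * 3, replace = TRUE), nrow = N)  # 3 features, 5 levels
x3 <- matrix(sample(1:3, N * 4, replace = TRUE), nrow = N)  # 4 features, 3 levels

# Features with the same number of levels are placed side by side.
x <- cbind(x5, x3)
x[2, 1] <- NA                      # missing values are coded as NA

# Class labels in 1, ..., kr.
kr <- 2
y <- sample(1:kr, N, replace = TRUE)

# One entry per group of features (D = 2 here).
m <- c(5, 3)                       # number of levels in each group
J <- c(ncol(x5), ncol(x3))         # number of columns in each group
idx_list <- cumsum(c(1, J[-length(J)]))  # first column of each group: 1 and 4

# The burn-in must be shorter than the full run.
nbSEM <- 50
nbSEMburn <- 40
stopifnot(nbSEMburn < nbSEM,
          length(m) == length(idx_list),
          nrow(x) == length(y))
```

Objects built this way could then be supplied as `x`, `y`, `idx_list`, `m`, `kr`, `nbSEM` and `nbSEMburn`, together with a `kc` vector that also has one entry per group.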

Returns an object with the following slots:

`@zr`
Vector of length N with the resulting row partition.

`@zc`
List of length D. The d^th item is a vector of length J[d] representing the column partition for the d^th group of variables.

`@J`
Vector of length D. The d^th item represents the number of columns of the d^th group of variables.

`@W`
List of length D. Item d is a matrix of dimension J[d]*kc[d] such that W[j,h]=1 if column j belongs to column cluster h.

`@V`
Matrix of dimension N*kr such that V[i,g]=1 if row i belongs to class g.

`@icl`
ICL value for co-clustering.

`@kr`
Number of row classes.

`@name`
Name of the result.

`@number_distrib`
Number of groups of variables.

`@pi`
Vector of length kr. Row mixing proportions.

`@rho`
List of length D. The d^th item represents the column mixing proportions for the d^th group of variables.

`@dlist`
List of length D. The d^th item represents the column indexes of the d^th group of variables.

`@kc`
Vector of length D. The d^th element represents the number of column clusters for the d^th group of variables.

`@m`
Vector of length D. The d^th element represents the number of levels of the d^th group of variables.

`@nbSEM`
Number of SEM-Gibbs iterations.

`@params`
List of length D. The d^th item represents the block parameters for the d^th group of variables.

`@xhat`
List of length D. The d^th item represents the dataset of the d^th group of variables, with missing values completed.
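The relationship between a partition vector (such as `@zr` or an item of `@zc`) and its indicator matrix (such as `@V` or an item of `@W`) can be illustrated with plain base R. The vectors below are made-up toy values, not output of `bosclassif`, and the construction is only a sketch of the definitions given above.

```r
# Toy row partition (hypothetical): 5 observations, kr = 2 classes.
zr <- c(1, 2, 2, 1, 2)
kr <- 2

# Indicator matrix: V[i, g] = 1 if row i belongs to class g, 0 otherwise.
V <- outer(zr, seq_len(kr), FUN = "==") * 1

# Each row of V contains exactly one 1, so the partition can be recovered.
zr_back <- apply(V, 1, which.max)

# Empirical class proportions of this toy partition: 0.4 and 0.6.
# (The fitted @pi slot holds estimated proportions and need not be computed this way.)
prop <- colMeans(V)

# The same construction applies to a column partition and its matrix W.
zc <- c(1, 1, 2)                       # 3 features, 2 column clusters
W <- outer(zc, seq_len(2), FUN = "==") * 1
```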

Margot Selosse, Julien Jacques, Christophe Biernacki.

```
library(ordinalClust)

# loading the real dataset
data("dataqol.classif")
set.seed(5)

# loading the ordinal data
M <- as.matrix(dataqol.classif[,2:29])

# creating the class values
y <- as.vector(dataqol.classif$death)

# sampling datasets for training and for prediction
nb.sample <- ceiling(nrow(M)*2/3)
sample.train <- sample(1:nrow(M), nb.sample, replace=FALSE)
M.train <- M[sample.train,]
M.validation <- M[-sample.train,]

# entries equal to 0 in the validation set are treated as missing here
# and replaced with levels drawn at random from 1..m
nb.missing.validation <- length(which(M.validation==0))
m <- c(4)
M.validation[which(M.validation==0)] <- sample(1:m, nb.missing.validation, replace=TRUE)

y.train <- y[sample.train]
y.validation <- y[-sample.train]

# configuration for the SEM-Gibbs algorithm
nbSEM <- 50
nbSEMburn <- 40
nbindmini <- 1
init <- "kmeans"

# number of classes to predict
kr <- 2

# number of column clusters (kc); other values could be compared by cross-validation
kcol <- 1

res <- bosclassif(x=M.train, y=y.train, kr=kr, kc=kcol, m=m,
                  nbSEM=nbSEM, nbSEMburn=nbSEMburn,
                  nbindmini=nbindmini, init=init)

# predicting the classes of the validation set
predictions <- predict(res, M.validation)
```
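Once predicted labels are available, they can be compared with the held-out classes. The sketch below uses made-up vectors: `y.pred` is a hypothetical vector of predicted labels, and how such a vector is extracted from the `predictions` object is not covered here. It only shows a confusion table and accuracy computed with base R.

```r
# Toy values (hypothetical), standing in for the held-out and predicted classes.
y.validation <- c(1, 2, 2, 1, 2, 1)
y.pred       <- c(1, 2, 1, 1, 2, 2)
kr <- 2

# Confusion table; fixing the factor levels keeps it square
# even if one class is never predicted.
conf <- table(observed  = factor(y.validation, levels = seq_len(kr)),
              predicted = factor(y.pred,       levels = seq_len(kr)))

# Share of correctly classified observations (diagonal of the table).
accuracy <- sum(diag(conf)) / sum(conf)
```

Repeating this over several values of `kc` on resampled splits is one way to carry out the cross-validation mentioned in the example's comments.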
