NominalLogisticBiplot: Nominal Logistic Biplot for polytomous data

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/NominalLogisticBiplot.R

Description

Function that calculates the parameters of the Nominal Logistic Biplot according to Hernandez-Sanchez & Vicente-Villardon (2013).

Usage

1
2
3
4
NominalLogisticBiplot(datanom, sFormula = NULL, numFactors = 2,
method = "EM", rotation = "varimax", metfsco = "EAP",
nnodos = 10, tol = 1e-04, maxiter = 100, penalization = 0.1,
cte = TRUE,initial=1,alfa=1, show = FALSE)

Arguments

datanom

The data set, it can be a matrix with integers or a data frame with factors. All variables have to be nominal.

sFormula

This parameter follows the unifying interface for selecting variables from a data frame for a plot, test or model. The most common formula es of type y ~ x1+x2+x3. It has a default value of NULL if not specified.

numFactors

Number of dimensions of the solution. It should be lower than the number of variables. It has a default value of 2.

method

This parameter can be: "EM", "ACM", "MIRT" or "PCOA". Method to compute the row coordinates.

rotation

Rotation method to used with "MIRT" option in "coordinates". No effect fror other options.

metfsco

Calculation method for the fscores with "MIRT" option in "coordinates". No effect fror other options.

nnodos

Number of nodes for gauss quadrature in the EM algorithm.

tol

Tolerance for the EM algorithm.

maxiter

Maximum number of iterations in the EM algorithm.

penalization

Penalization for the ridge regression for each variable.

cte

Include constant in the logistic regression model. Default is TRUE.

initial

Value to decide the method(1-Correspondence analysis, 2-Mirt) that calculates the initial abilities values for the individuals.

alfa

If initial parameter method is correspondence analysis, this parameter determines the weight for rows and columns.

show

Show intermediate copmputations. Default is TRUE.

Details

The general algorithm used is essentially an alternated procedure in which parameters for rows and columns are computed in alternated steps repeated until convergence. Parameters for the rows are calculated by expectation (E-step) or by a external procedure (Multiple Correspondence Analysis or Principal Coordinates Analysis) and parameters for the columns are computed by maximization (M-step), i. e., by Nominal Logistic Regression. When the procedure for Row scores is external, only one iteration is performed and the procedure is called "External Nominal Logistic Biplot". Because the aim of the biplot is the representation

There are several options for the computation:

1.- Using the package mirt to obtain the row scores, i. e. using a solution obtained from a latent trait model. The column (item) parameters should be directly used by our biplot procedure but, because of the characteristics of the package that performs a default rotation after parameter estimation, we have to reestimate the item parametes to be coherent to the scores.

2.- Using our implementation of the EM algorithm alternating expected a porteriori scores and Ridge Nominal Logistic Regression for each variable.

3.- Using external coordinates for the rows taken from Multiple Correspondence Analysis or Principal Coordinates Analysis and fitting the response surfaces in just one step.

Equations that define a set of probability response surfaces (one for each category and each variable) are no longer sigmoid as in the binary case (Vicente-Villardon et al. (2006)). This means that the level curves are no longer straight lines and then, prediction of probabilities is not made by projection as in the usual linear biplots. For each variable, define a set of convex polygons that can be interpreted as "prediction" regions in the same way as in Gower & Hand (1996). Each pair of response surfaces defined by intersect in a straight line that, projected onto the space of predictors, is the set of points in which the probability of both categories is the same. Those lines are the candidates to be the edges of the convex polygons defining the prediction regions.

Value

An object of class "nominal.logistic.biplot". This has some components:

dataSet

Data set of study with all the information about the name of the levels and names of the variables and individuals

RowCoords

Coordinates for the individuals in the reduced space

VariableModels

Information of the regression resuls for each variable.

NumFactors

Number of dimensions selected for the study

Method

Method for calculating the row positions

Rotation

Type of rotation if we have chosen mirt coordinates

Methodfscores

Method of calculation of the fscores in mirt process

NumNodos

Number of nodes for the gauss quadrature in EM algorith

tol

Cut point to stop the EM-algorith

maxiter

Maximum number of iterations in the EM-algorith

penalization

Value for the correction of the ridge regression

cte

Boolean value to choose if the model for each variable will have independent term

show

Boolean value to indicate if we want to see the results of our analysis

Author(s)

Julio Cesar Hernandez Sanchez, Jose Luis Vicente-Villardon

Maintainer: Julio Cesar Hernandez Sanchez <juliocesar_avila@usal.es>

References

Hernandez, J. C. & Vicente-Villardon, J. L. (2013) Logistic Biplots for Nominal Data. Submitted. Preprint available at : https://www.researchgate.net/publication/256428288_Logistic_Biplot_for_Nominal_Data?ev=prf_pub

Vicente-Villardon, J., Galindo, M.P & Blazquez-Zaballos, A. (2006), Logistic biplots,Multiple Correspondence Analysis and related methods pp. 491–509.

Demey, J., Vicente-Villardon, J. L., Galindo, M.P. & Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24), 2832-2838.

Baker, F.B. (1992): Item Response Theory. Parameter Estimation Techniques. Marcel Dekker. New York.

Gabriel, K. (1971), The biplot graphic display of matrices with application to principal component analysis., Biometrika 58(3), 453–467.

Gabriel, K. R. (1998), Generalised bilinear regression, Biometrika 85(3), 689–700.

Gabriel, K. R. & Zamir, S. (1979), Lower rank approximation of matrices by least squares with any choice of weights, Technometrics 21(4), 489–498.

Gower, J. & Hand, D. (1996), Biplots, Monographs on statistics and applied probability. 54. London: Chapman and Hall., 277 pp.

Chalmers,R,P (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. URL http://www.jstatsoft.org/v48/i06/.

See Also

NominalLogBiplotEM, afc, PCoA

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
data(HairColor)
nlbo = NominalLogisticBiplot(HairColor,sFormula=NULL,
numFactors=2,method="EM",penalization=0.2,show=FALSE)
nlbo

#data(PhD_nomCyL)
#cyl = NominalLogisticBiplot(PhD_nomCyL,sFormula=NULL,
#numFactors=2,method="EM",initial = 1,penalization=0.3,show=FALSE)
#summary(nlboPhD)
#plot(nlboPhD,QuitNotPredicted=TRUE,ReestimateInFocusPlane=TRUE,
#      planex = 1,planey = 2,proofMode=TRUE,LabelInd=FALSE,AtLeastR2 = 0.01,
#      xlimi=-1.5,xlimu=1.5,ylimi=-1.5,ylimu=1.5,linesVoronoi = TRUE,SmartLabels = FALSE,
#      PlotInd=TRUE,
#      CexInd = c(0.4),                                                           
#      PchInd = c(1),
#      ColorInd="azure3",
#      PlotVars=TRUE,LabelVar = TRUE,
#      PchVar = c(1,2,3,4,5,6,7,8,9),ColorVar = c("red","black","maroon","blue","green",
#      "chocolate4","coral3","brown","brown2"),
#      ShowResults=TRUE)

NominalLogisticBiplot documentation built on May 2, 2019, 6:03 a.m.