pcaQDA: Quadratic Discriminant Analysis (QDA) using Principal...

pcaQDA R Documentation

Quadratic Discriminant Analysis (QDA) using Principal Component Analysis (PCA)

Description

The principal components (PCs) of the predictor variables in the input data are estimated, and the individual coordinates in the selected PCs are then used as predictors in the QDA.

Predict using a PCA-QDA model built with function 'pcaQDA'.

Usage

pcaQDA(
  formula = NULL,
  data = NULL,
  grouping = NULL,
  n.pc = 1,
  retx = TRUE,
  scale = FALSE,
  center = FALSE,
  tol = 1e-04,
  method = "moment",
  ...
)

predict.pcaQDA(
  object,
  newdata = NULL,
  type = c("qda.pred", "qd", "class", "posterior", "pca.ind.coord", "all"),
  ...
)

Arguments

formula

Same as in qda from package 'MASS'.

data

Same as in qda from package 'MASS'.

grouping

Same as in qda from package 'MASS'.

n.pc

Number of principal components to use in the qda.

retx

A logical value indicating whether the rotated variables should be returned.

scale

Same as in prcomp from package 'stats'.

center

Same as in prcomp from package 'stats'.

tol

Same as in prcomp from package 'stats'.

method

Same as in qda from package 'MASS'.

...

Further parameters to pass to qda.

object

To use with function 'predict'. A 'pcaQDA' object containing a list of two objects: 1) an object of class inheriting from 'qda' and 2) an object of class inheriting from 'prcomp'.

newdata

To use with function 'predict'. New data for classification prediction.

type

To use with function 'predict'. The type of prediction required:

  • "qda.pred": Return the object given by predict.qda plus the PCA individual coordinates.

  • "qd": A data frame carrying the quadratic discriminant scores for each individual plus the individual classifications.

  • "class": Individual classifications.

  • "posterior": Posterior classification probabilities for each individual.

  • "pca.ind.coord": Only the pca individual coordinates.

  • "all": A list carrying: "qd", "posterior", and "pca.ind.coord".
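
As a rough illustration of where the "class", "posterior" and "pca.ind.coord" outputs come from, the sketch below reproduces the prediction step with 'prcomp' and 'qda' called directly. It is not the package code: the train/test split and the object names ('pca', 'fit', 'ind_coord') are assumptions made for this example, and it does not show how the "qd" scores are computed.

```r
library(MASS)  # provides qda() and its predict method

# Deterministic split of the built-in iris data: 35 rows per species for training
idx   <- c(1:35, 51:85, 101:135)
train <- iris[idx, ]
test  <- iris[-idx, ]

# PCA on the training predictors, then QDA on the first two PCs
pca <- prcomp(train[, 1:4], center = TRUE, scale. = TRUE)
fit <- qda(pca$x[, 1:2], grouping = train$Species)

# Project the new observations onto the training PCs
# (analogous to the "pca.ind.coord" output)
ind_coord <- predict(pca, newdata = test[, 1:4])[, 1:2, drop = FALSE]

# Classify the projected observations
pred <- predict(fit, newdata = ind_coord)

head(pred$class)      # analogous to type = "class"
head(pred$posterior)  # analogous to type = "posterior"
head(ind_coord)
```

The key point is that new data must be projected with the PCA fitted on the training set, so that the centering, scaling and rotation match those used to build the QDA.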

Details

The principal components (PCs) are obtained using the function 'prcomp' from the R package 'stats', while the QDA is performed using the 'qda' function from the R package 'MASS'. The current implementation uses only the basic functionality of these functions. As shown in the example, the 'pcaQDA' function can be used in general classification problems.
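
The two-step procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package implementation: 'pca_qda_sketch' is a hypothetical helper written for this example, and it only mirrors the documented arguments 'n.pc', 'center', 'scale' and 'tol'.

```r
library(MASS)  # provides qda()

# Hypothetical helper (not part of MethylIT): PCA followed by QDA on the PC scores
pca_qda_sketch <- function(x, grouping, n.pc = 1, center = FALSE,
                           scale = FALSE, tol = 1e-04) {
    # Step 1: estimate the principal components of the predictors
    pca <- prcomp(x, retx = TRUE, center = center, scale. = scale, tol = tol)

    # Never request more PCs than prcomp retained
    n.pc <- min(n.pc, ncol(pca$x))

    # Step 2: use the individual coordinates in the selected PCs as predictors
    scores <- pca$x[, seq_len(n.pc), drop = FALSE]
    fit <- qda(scores, grouping = grouping)

    list(qda = fit, pca = pca, n.pc = n.pc)
}

fit <- pca_qda_sketch(iris[, 1:4], iris$Species, n.pc = 2,
                      center = TRUE, scale = TRUE)
class(fit$qda)
class(fit$pca)
```

Reducing the predictors to a few PCs before the QDA lowers the number of covariance parameters that have to be estimated per group, which is the usual motivation for this kind of pipeline.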

Value

Function 'pcaQDA' returns an object ('pcaQDA') consisting of a list with two objects:

  1. 'qda': an object of class qda from package 'MASS'.

  2. 'pca': an object of class prcomp from package 'stats'.

For information on how to use these objects see ?qda and ?prcomp.
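
As a hedged illustration of what can be read from objects of these two classes, the sketch below builds a standalone 'prcomp' object and a standalone 'qda' object on the built-in iris data and inspects a few of their standard components. The objects 'pca' and 'fit' are created here for the example only; they stand in for the 'pca' and 'qda' elements of a fitted model.

```r
library(MASS)  # provides qda()

pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
fit <- qda(pca$x[, 1:2], grouping = iris$Species)

# prcomp object: proportion of variance explained by each PC,
# which can help when choosing n.pc
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cumsum(var_explained)

# prcomp object: loadings of the original variables on the first two PCs
pca$rotation[, 1:2]

# qda object: prior probabilities and group means in PC space
fit$prior
fit$means
```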

See Also

pcaLDA, qda and predict.qda

Examples

## Generate training and testing sets
data("iris3", package = "datasets")
set.seed(1)
rs <- sample(1:50, 25)
train <- data.frame(rbind(iris3[rs,,1], iris3[rs,,2], iris3[rs,,3]))
test <- data.frame(rbind(iris3[-rs,,1], iris3[-rs,,2], iris3[-rs,,3]))
cl <- factor(c(rep("setosa",25), rep("versicolor",25), rep("virginica",25)))
train$species <- cl
test$species <- cl

## Applying PCA + QDA
model <- pcaQDA(formula = species ~ ., data = train, n.pc = 2,
                scale = TRUE, center = TRUE)

## Make predictions on the test set
pred_test <- predict(model, newdata = test, type = "all")
lapply(pred_test, head) ## The heads of the list elements


## Classification performance
require(caret)

conf.mat <- confusionMatrix(
    data = factor(pred_test$qd$class, levels = levels(test$species)),
    reference = test$species)

conf.mat

## Graph of the individual quadratic-discriminant scores
require("ggplot2")

dt <- predict(model, newdata = test, type = "qd")

p0 <- theme(
    axis.text.x  = element_text( face = "bold", size = 18, color="black",
                                 # hjust = 0.5, vjust = 0.5,
                                 family = "serif", angle = 0,
                                 margin = margin(1,0,1,0, unit = "pt" )),
    axis.text.y  = element_text( face = "bold", size = 18, color="black",
                                 family = "serif",
                                 margin = margin( 0,0.1,0,0, unit = "mm" )),
    axis.title.x = element_text(face = "bold", family = "serif", size = 18,
                                color="black", vjust = 0 ),
    axis.title.y = element_text(face = "bold", family = "serif", size = 18,
                                color="black",
                                margin = margin( 0,2,0,0, unit = "mm" ) ),
    legend.title=element_blank(),
    legend.text = element_text(size = 20, face = "bold", family = "serif"),
    legend.position = c(0.5, 0.83),

    panel.border = element_rect(fill=NA, colour = "black", linewidth=0.07),
    panel.grid.minor = element_line(color= "white", linewidth = 0.2),
    axis.ticks = element_line(linewidth = 0.1),
    axis.ticks.length = unit(0.5, "mm"),
    plot.margin = unit(c(1,1,2,1), "lines"))

ggplot(dt, aes(x = QD1, y = QD2, colour = class)) +
    geom_point(size = 3) +
    scale_color_manual(values = c("green4","blue","brown1")) +
    stat_ellipse(aes(x = QD1, y = QD2, fill = class), data = dt,
                 type = "norm", geom = "polygon", level = 0.5,
                 alpha=0.2, show.legend = FALSE) +
    scale_fill_manual(values = c("green4","blue","brown1")) + p0

genomaths/MethylIT documentation built on Feb. 3, 2024, 1:24 a.m.