# Partial Least Squares Discriminant Analysis for IsomirDataSeq

### Description

Use PLS-DA method with the normalized count data to detect the most important features (miRNAs/isomiRs) that explain better the group of samples given by the experimental design. It is a supervised clustering method with permutations to calculate the significance of the analysis.

### Usage

1 2 |

### Arguments

`ids` |
object of class |

`group` |
column name in |

`validation` |
type of validation, either NULL or "learntest". Default NULL |

`learn` |
optional vector of indexes for a learn-set. Only used when validation="learntest". Default NULL |

`test` |
optional vector of indices for a test-set. Only used when validation="learntest". Default NULL |

`tol` |
tolerance value based on maximum change of cumulative R-squared coefficient for each additional PLS component. Default tol=0.001 |

`nperm` |
number of permutations to compute the PLD-DA p-value based on R2 magnitude. Default nperm=400 |

`refinment` |
logical indicating whether a refined model, based on filtering out variables with low VIP values |

`vip` |
Variance Importance in Projection threshold value when a refinement process is considered. Default vip=1.2 |

### Details

Partial Least Squares Discriminant Analysis (PLS-DA) is a technique specifically appropriate for analysis of high dimensionality data sets and multicollinearity (Perez-Enciso, 2013). PLS-DA is a supervised method (i.e. makes use of class labels) with the aim to provide a dimension reduction strategy in a situation where we want to relate a binary response variable (in our case young or old status) to a set of predictor variables. Dimensionality reduction procedure is based on orthogonal transformations of the original variables (miRNAs/isomiRs) into a set of linearly uncorrelated latent variables (usually termed as components) such that maximizes the separation between the different classes in the first few components (Xia, 2011). We used sum of squares captured by the model (R2) as a goodness of fit measure.

We implemented this method using the
`DiscriMiner-package`

into `isoPLSDA`

function.
The output
p-value of this function will tell about the statistical
significant of the group separation using miRNA/isomiR expression data.

Read more about the parameters related to the PLS-DA directly from plsDA function.

### Value

A `list`

with the following elements: `R2Matrix`

(R-squared coefficients of the PLS model),
`components`

(of the PLS, similar to PCs in a PCA),
`vip`

(most important isomiRs/miRNAs),
`group`

(classification of the samples),
`p.value`

and `R2PermutationVector`

obtained by the permutations.

If the option `refinment`

is set to `TRUE`

, then the following
elements will appear:
`R2RefinedMatrix`

and `componentsRefinedModel`

(R-squared coefficients
of the PLS model only using the most important miRNAs/isomiRs). As well,
`p.valRefined`

and `R2RefinedPermutationVector`

with p-value
and R2 of the
permutations where samples were randomized. And finally,
`p.valRefinedFixed`

and `R2RefinedFixedPermutationVector`

with
p-value and R2 of the
permutations where miRNAs/isomiRs were randomized.

### References

Perez-Enciso, Miguel and Tenenhaus, Michel. Prediction of clinical outcome with microarray data: a partial least squares discriminant analysis (PLS-DA) approach. Human Genetics. 2003.

Xia, Jianguo and Wishart, David S. Web-based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst. Nature Protocols. 2011.

### Examples

1 2 3 4 5 6 7 |