featureSelection: featureSelection function calculates the optimal order of...
In KnowSeq: KnowSeq R/Bioc package: The Smart Transcriptomic Pipeline


The featureSelection function calculates the optimal order of DEGs (differentially expressed genes) for the subsequent machine learning process, using the mRMR algorithm, Random Forest, or a Disease Association ranking, depending on the mode parameter. The resulting ranking is returned and can be used as input for the vars_selected parameter of the machine learning functions.

featureSelection(
  data,
  labels,
  vars_selected,
  mode = "mrmr",
  disease = "",
  subdiseases = c(),
  maxGenes = ncol(data)
)

`data`	An expression matrix or data.frame with the genes in the columns and the samples in the rows.
`labels`	A vector or factor that contains the label for each sample in the data parameter.
`vars_selected`	The genes to use in the feature selection process. It can be the final DEGs extracted with the function `DEGsExtraction` or a custom vector of genes.
`mode`	The algorithm used to calculate the gene ranking. There are three options: mrmr, rf and da.
`disease`	The name of a disease, used to calculate the Disease Association ranking for the DEGs indicated in the vars_selected parameter.
`subdiseases`	A vector with the names of particular subdiseases of disease, used to calculate the Disease Association ranking for the DEGs indicated in the vars_selected parameter.
`maxGenes`	An integer that indicates the maximum number of genes to be returned.
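
Because data must have samples in rows and genes in columns, a matrix stored in the more common genes-by-samples layout needs to be transposed first. The following is a minimal sketch in base R of preparing and checking the inputs; the toy matrix and the names exprs, sampleLabels, dataForFS and candidateGenes are hypothetical and not part of KnowSeq.

```r
# Toy expression matrix in genes-by-samples layout (hypothetical data).
set.seed(1)
exprs <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4,
                dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# One label per sample, in the same order as the columns of exprs.
sampleLabels <- factor(c("control", "control", "tumour", "tumour"))

# featureSelection expects samples in rows and genes in columns, so transpose.
dataForFS <- t(exprs)

# Candidate genes for vars_selected: here, simply every gene in the matrix.
candidateGenes <- colnames(dataForFS)

# Sanity checks before calling
# featureSelection(dataForFS, sampleLabels, candidateGenes)
stopifnot(nrow(dataForFS) == length(sampleLabels))       # one label per sample
stopifnot(all(candidateGenes %in% colnames(dataForFS)))  # genes exist in data
stopifnot(!anyNA(dataForFS))                             # no missing values
```

Checking these conditions up front is only a suggestion; whether featureSelection itself validates its inputs is not stated here.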

A vector that contains the ranking of genes.

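library(KnowSeq)
# Load the example data shipped with KnowSeq (DEGsMatrix and labels are used below)
dir <- system.file("extdata", package = "KnowSeq")
load(file.path(dir, "expressionExample.RData"))
# DEGsMatrix has genes in rows, so transpose it to samples-by-genes
featureRanking <- featureSelection(t(DEGsMatrix), labels, rownames(DEGsMatrix), mode = "mrmr")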