featureSelection: featureSelection function calculates the optimal order of...

View source: R/featureSelection.R

featureSelection    R Documentation

Description

The featureSelection function calculates the optimal order of the DEGs (differentially expressed genes) to achieve the best result in the subsequent machine learning process, using the mRMR algorithm or Random Forest. The resulting ranking is returned and can be passed to the vars_selected parameter of the machine learning functions.

Usage

featureSelection(
  data,
  labels,
  vars_selected,
  mode = "mrmr",
  disease = "",
  maxGenes = ncol(data)
)

Arguments

data

The data parameter is an expression matrix or data.frame that contains the genes in the columns and the samples in the rows.

labels

A vector or factor that contains the label for each sample in the data parameter.

vars_selected

The genes selected to use in the feature selection process. It can be the final DEGs extracted with the function DEGsExtraction or a custom vector of genes.

mode

The algorithm used to calculate the gene ranking. Three options are available: mrmr (mRMR, the default), rf (Random Forest) and da (Disease Association).

disease

The name of a disease, used to calculate the Disease Association ranking of the DEGs indicated in the vars_selected parameter.

maxGenes

Integer that indicates the maximum number of genes to be returned. Defaults to ncol(data), that is, all genes.
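
The orientation expected by data (samples in rows, genes in columns) is easy to get wrong when the expression matrix is stored with genes in rows. The following is a minimal sketch in base R of preparing the three main inputs. The toy matrix, the gene and sample names, and the class labels are all hypothetical and are not KnowSeq data.

```r
# Toy expression matrix (hypothetical values): 4 genes in rows, 6 samples in columns
set.seed(1)
exprs <- matrix(rnorm(24), nrow = 4,
                dimnames = list(paste0("gene", 1:4), paste0("sample", 1:6)))

# One label per sample, in the same order as the columns of exprs
labels <- factor(rep(c("control", "tumour"), each = 3))

# Transpose so that samples are in rows and genes are in columns
data <- t(exprs)

# Candidate genes for the ranking: here, every gene in the matrix
vars_selected <- colnames(data)

# Sanity checks before calling a ranking function
stopifnot(nrow(data) == length(labels))
stopifnot(all(vars_selected %in% colnames(data)))
```

Inputs shaped like this match the argument descriptions above and could then be passed as featureSelection(data, labels, vars_selected).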

Value

A vector that contains the ranking of genes.

Examples

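# Load the example expression data shipped with KnowSeq
dir <- system.file("extdata", package = "KnowSeq")
load(file.path(dir, "expressionExample.RData"))
# Rank the DEGs with mRMR; t() puts samples in rows and genes in columns
featureRanking <- featureSelection(t(DEGsMatrix), labels, rownames(DEGsMatrix), mode = "mrmr")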

CasedUgr/KnowSeq documentation built on Aug. 16, 2022, 6:19 a.m.