This function ranks features by importance, as measured by their RReliefF score.
A transposed data frame of expression data, i.e. counts transformed by vst or rlog. A log2-transformed expression matrix is also accepted. Rows and columns should be observations and features, respectively.
A data frame with class and known variables; it must include at least one column labelled 'class'.
Fraction of samples to be used when running the RReliefF algorithm; default is 1.
This function is very time-consuming when the number of features is high: we observed a quadratic relationship between execution time and the number of features. We therefore provide a formula that lets users estimate the time needed for this step, given the number of features. The formula is:
T = 0.0011 * N^2 - 0.1822 * N + 27.092
where T = time and N = number of genes. We strongly suggest filtering out non-informative features before performing this step.
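As a rough illustration, the formula above can be wrapped in a small R helper to compare the estimated cost before and after filtering. This is only a sketch: the helper name `estimate_fsort_time` and the feature counts are hypothetical, the coefficients are the ones quoted above, and the result is in whatever time unit the authors used, which this page does not state.

```r
# Hypothetical helper (not part of the package): evaluates the quadratic
# time estimate quoted above for a given number of features N.
estimate_fsort_time <- function(n_features) {
  0.0011 * n_features^2 - 0.1822 * n_features + 27.092
}

# Made-up feature counts, e.g. before and after filtering.
n_before <- 5000
n_after  <- 1000

t_before <- estimate_fsort_time(n_before)
t_after  <- estimate_fsort_time(n_after)

# Ratio of the two estimates: how much the estimate shrinks after filtering.
speedup <- t_before / t_after

print(c(before = t_before, after = t_after, ratio = speedup))
```

Because the dominant term is quadratic, reducing the number of features by a factor of five lowers the estimate by far more than a factor of five, which is why filtering beforehand is worthwhile.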
A data frame with two columns, where features are sorted by importance scores:
RReliefF score - Calculated by
scaled.RReliefF score - Z-score value, computed for each RReliefF score.
A plot with the first 50 features ordered by their importance.
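To make the relationship between the two returned columns concrete, here is a minimal R sketch of how a z-score column can be derived from raw importance scores. The scores and gene names are invented, and the use of the sample standard deviation (`sd`) is an assumption, since this page does not say how the function standardizes the values.

```r
# Made-up RReliefF scores for five hypothetical features.
rrelieff <- c(geneA = 0.12, geneB = 0.05, geneC = -0.01,
              geneD = 0.08, geneE = 0.02)

# Z-score: subtract the mean, then divide by the sample standard deviation
# (assumption: the package may standardize differently).
scaled <- (rrelieff - mean(rrelieff)) / sd(rrelieff)

# Assemble a two-column data frame and sort features by decreasing score,
# mirroring the layout described above.
ranked <- data.frame(RReliefF = rrelieff, scaled.RReliefF = scaled)
ranked <- ranked[order(ranked$RReliefF, decreasing = TRUE), ]

print(ranked)
```

Z-scoring is monotonic, so sorting by either column gives the same ranking; the scaled column only makes it easier to see how far a feature sits from the average score.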
Mattia Chiesa, Luca Piacentini
Marko Robnik-Sikonja, Igor Kononenko: An adaptation of Relief for attribute estimation in regression. In: Fourteenth International Conference on Machine Learning, 296-304, 1997