View source: R/Feature_Selection.R
This function ranks features by their importance, as measured by the RReliefF score.
DaMiR.FSort(data, df, fSample = 1)
data: A transposed data frame of expression data, i.e. counts transformed by vst or rlog. A log2-transformed expression matrix is also accepted. Rows and columns should be observations and features, respectively.
df: A data frame with class and known variables; at least one column labeled 'class' must be included.
fSample: Fraction of samples to be used by the RReliefF algorithm; default is 1.
This function is very time-consuming when the number of features is high: we observed a quadratic relationship between execution time and the number of features. We therefore provide a formula that allows users to estimate the time needed for this step, given the number of features. The formula is:
T = 0.0011 * N^2 - 0.1822 * N + 27.092
where T = Time and N = Number of genes. We strongly suggest filtering out non-informative features before performing this step.
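As a rough illustration, the formula above can be wrapped in a small helper to gauge the cost of ranking before running it. The function name `estimate_fsort_time` is hypothetical and not part of the package, and the result is in whatever time unit the formula was fitted in, which is not stated here.

```r
# Hypothetical helper (not part of the package): evaluates the quadratic
# time estimate T = 0.0011 * N^2 - 0.1822 * N + 27.092 given above.
estimate_fsort_time <- function(n_features) {
  0.0011 * n_features^2 - 0.1822 * n_features + 27.092
}

# Compare the estimated cost for a few candidate feature counts
n <- c(100, 500, 1000)
estimates <- estimate_fsort_time(n)
print(data.frame(n_features = n, estimated_time = estimates))
```

Because the estimate grows quadratically, halving the number of features by filtering cuts the expected time to roughly a quarter for large N.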
A data frame with two columns, where features are sorted by importance scores:
RReliefF score - Calculated by the relief function, implemented in the FSelector package;
scaled.RReliefF score - Z-score value, computed for each RReliefF score.
A plot with the first 50 features ordered by their importance.
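The relationship between the two returned columns can be sketched as follows. This is only an illustration of a Z-score transformation, assuming the usual definition (centering by the mean and dividing by the standard deviation); the score values, feature names and column names below are made up for the example and are not output of DaMiR.FSort.

```r
# Made-up importance scores for five hypothetical features
scores <- c(geneA = 0.12, geneB = 0.05, geneC = 0.30, geneD = -0.02, geneE = 0.08)

# Z-score: center by the mean and scale by the standard deviation
scaled <- (scores - mean(scores)) / sd(scores)

# Assemble a two-column data frame and sort features by decreasing importance
ranking <- data.frame(RReliefF = scores, scaled.RReliefF = scaled)
ranking <- ranking[order(ranking$RReliefF, decreasing = TRUE), ]
print(ranking)
```

Sorting by either column gives the same order, since the Z-score is a monotonic rescaling of the raw score.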
Mattia Chiesa, Luca Piacentini
Marko Robnik-Sikonja, Igor Kononenko: An adaptation of Relief for attribute estimation in regression. In: Fourteenth International Conference on Machine Learning, 296-304, 1997
relief, DaMiR.FSelect, DaMiR.FReduct
# use example data:
data(data_reduced)
data(df)
# rank features by importance:
df.importance <- DaMiR.FSort(data_reduced[,1:10],
                             df, fSample = 0.75)