Functions to create functions that filter potential predictive features using statistics that do not access class labels.

```
filterMean(cutoff)
filterMedian(cutoff)
filterSD(cutoff)
filterMin(cutoff)
filterMax(cutoff)
filterRange(cutoff)
filterIQR(cutoff)
```

`cutoff`
A real number, the level above which features with this statistic should be retained and below which they should be discarded.

Following the usual conventions from the world of
gene expression microarrays, a typical data matrix is constructed from
columns representing samples on which we want to make predictions
and rows representing the features used to construct the predictive
model. In this context, we define a *filter* to be a function
that accepts a data matrix as its only argument and returns a logical
vector, whose length equals the number of rows in the matrix, where
'TRUE' indicates features that should be retained. Most filtering
functions belong to parametrized families, with one of the most common
examples being
"retain all features whose mean is above some pre-specified cutoff".
We implement this idea using a set of function-generating functions,
whose arguments are the parameters that pick out the desired member
of the family. The return value is an instantiation of a particular
filtering function. We define things this way so that the methods can
be applied in cross-validation (or other) loops where
we want to ensure that we use the same filtering rule each time.
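The function-generating idea described above can be sketched in a few lines of R. This is only an illustration of the closure pattern, not the package's source: `makeMeanFilter` and the `toy` matrix are hypothetical names, and the strict `>` comparison is an assumption about how the cutoff is applied.

```r
# Hypothetical stand-in for a filter generator; NOT the package's own code.
# It takes the family parameter (cutoff) and returns one member of the family.
makeMeanFilter <- function(cutoff) {
  # The returned closure remembers 'cutoff' from the enclosing call.
  function(data) {
    # One logical value per row (feature); strict '>' is an assumption.
    rowMeans(data, na.rm = TRUE) > cutoff
  }
}

# Toy matrix: 4 features (rows) by 3 samples (columns).
toy <- matrix(c(1, 2, 3,
                4, 5, 6,
                7, 8, 9,
                0, 0, 1),
              nrow = 4, byrow = TRUE)

keepHigh <- makeMeanFilter(3)          # instantiate one filter
keep <- keepHigh(toy)                  # logical vector, one entry per row
filtered <- toy[keep, , drop = FALSE]  # retain only the selected features
```

Because the cutoff is captured when the filter is created, the same rule can be passed around and reapplied to any matrix without restating the parameter.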

Each of the seven functions described here returns a filter function,
`f`, that can be used by `logicalVector <- f(data)`.
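As a rough illustration of why a reusable filter function is convenient, the sketch below applies one filter to the training portion of each fold in a simple cross-validation loop. Everything here is an assumption made for the example: `makeIQRFilter` is a hypothetical stand-in for a generator such as `filterIQR`, the data are simulated, and the fold assignment is deliberately naive.

```r
# Hypothetical stand-in for an IQR-based filter generator; NOT the package's code.
makeIQRFilter <- function(cutoff) {
  function(data) {
    # Interquartile range of each row (feature), compared with the cutoff.
    apply(data, 1, IQR, na.rm = TRUE) > cutoff
  }
}

set.seed(1)
data <- matrix(rnorm(50 * 20), nrow = 50)  # 50 features, 20 samples (simulated)
f <- makeIQRFilter(1.2)                    # one rule, fixed before the loop

folds <- rep(1:4, length.out = ncol(data)) # naive assignment of samples to 4 folds
keptPerFold <- sapply(1:4, function(k) {
  train <- data[, folds != k, drop = FALSE]
  logicalVector <- f(train)  # same rule, evaluated on each training set
  sum(logicalVector)         # number of features retained in this fold
})
```

The set of retained features may differ from fold to fold because the statistic is recomputed on each training set, but the rule itself stays fixed.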

Kevin R. Coombes <krc@silicovore.com>

See `Modeler-class` and `Modeler` for details
about how to perform cross-validation.
