| tspTransform | R Documentation |
Applies a top-scoring pairs (TSP) transformation: for each two-element subset of the m original features, it creates one logical feature, m(m-1)/2 in total,
which is TRUE when the value of the first feature is greater than or equal to the value of the second and FALSE otherwise (first and second refer to the order of features in the input).
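The comparison described above can be sketched in plain R. This is only an illustration, not the package's implementation: the helper name tspSketch is hypothetical, and the column naming (first name, separator, second name) and the order of the generated pairs are assumptions.

```r
# Hypothetical helper, not part of the package: a plain-R illustration
# of the pairwise "first >= second" comparison.
tspSketch <- function(x, sep = "__") {
  stopifnot(is.data.frame(x), ncol(x) >= 2)
  pairs <- combn(ncol(x), 2)  # 2 x m(m-1)/2 matrix of column index pairs
  out <- lapply(seq_len(ncol(pairs)), function(k) {
    # TRUE where the first feature is >= the second, in input order
    x[[pairs[1, k]]] >= x[[pairs[2, k]]]
  })
  # Assumed naming scheme: original names joined with sep
  names(out) <- paste(names(x)[pairs[1, ]], names(x)[pairs[2, ]], sep = sep)
  data.frame(out, check.names = FALSE)
}

toy <- data.frame(a = 1:3, b = c(3, 2, 1), c = rep(2, 3))
res <- tspSketch(toy)
res
```

With three input features, the sketch yields 3*2/2 = 3 logical columns, one per pair.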
tspTransform(x, sep = "__", sample, check.names = FALSE)
x |
Data.frame to be converted; has to be composed of at least two features of a single type (to be comparable). |
sep |
Separator string used to join original feature names to generate names for transformed features.
Can be set to |
sample |
A number of features to generate.
If set, the function generates only a random subset out of all possible |
check.names |
Passed to the underlying call to |
This transformation can be used to recreate top-scoring pairs methods using information theory concepts, for instance using MIM.
The main gain from TSP is that it is resilient to calibration errors, in particular to some sample batch biases; it also generates a robust, parameter-free discrete representation of the continuous input.
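A small base-R check can illustrate the resilience claim. The data and the per-sample offsets below are made up, and the pairwise helper is a hypothetical stand-in for the comparison step rather than the package's code. Adding the same offset to every feature of a sample leaves all pairwise comparisons unchanged.

```r
# Toy data; all values and offsets are made up for illustration.
x <- data.frame(a = c(1.5, 3.0, 0.5, 2.0),
                b = c(2.5, 1.0, 0.5, 4.0),
                c = c(0.0, 2.0, 1.5, 3.0))
bias <- c(10, -3, 0.25, 100)   # hypothetical per-sample calibration offsets
shifted <- x + bias            # row i of every column is shifted by bias[i]

# Hypothetical helper: compare every pair of columns (first >= second),
# in input order; returns a logical matrix with one column per pair.
pairwise <- function(d) {
  pairs <- combn(ncol(d), 2)
  sapply(seq_len(ncol(pairs)), function(k) d[[pairs[1, k]]] >= d[[pairs[2, k]]])
}

same <- identical(pairwise(x), pairwise(shifted))
same
```

The same reasoning applies to any strictly increasing distortion applied to all features of a sample, since it preserves their order.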
It is lossy, however, and the generated scores for feature pairs may be hard to interpret; the inflation of the feature count can also pose practical problems, which is why this function offers a way to efficiently and randomly under-sample the output.
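To give a sense of the inflation, the number of pairs can be computed with choose. The sampling helper below is a hypothetical, deliberately naive sketch: it enumerates all pairs before sampling, so it only suits small m, and it should not be read as the way the function performs its under-sampling.

```r
m <- 1000
nPairs <- choose(m, 2)   # number of two-element subsets of m features
nPairs

# Hypothetical helper: draw k distinct feature pairs out of all possible ones.
# Naive on purpose: materialises every pair first.
samplePairsSketch <- function(m, k) {
  pairs <- combn(m, 2)                               # all pairs, one per column
  pairs[, sample.int(ncol(pairs), k), drop = FALSE]  # k columns, no repeats
}

set.seed(1)
picked <- samplePairsSketch(10, 5)
picked
```

Each column of picked holds the indices of one sampled pair, with the first index always smaller than the second.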
For TSP to work well, it is crucial that the input features have approximately identical distributions, so that the output features have enough entropy to be informative about a decision or when compared with each other; to this end, re-scaling may be required, for instance with scale.
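The effect of re-scaling can be seen on the iris data using base R alone. This is an illustrative check on a single pair of features, not output of the function itself: on the raw measurements, Sepal.Length is always at least as large as Petal.Width, so the corresponding comparison is constant and carries no information, while after scale both outcomes occur.

```r
raw <- iris[, -5]

# Raw scale: the comparison is constant, hence uninformative.
rawCmp <- raw$Sepal.Length >= raw$Petal.Width
table(rawCmp)

# After centring and scaling each feature, both outcomes appear.
scaled <- data.frame(scale(raw))
scaledCmp <- scaled$Sepal.Length >= scaled$Petal.Width
table(scaledCmp)
```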
A logical data.frame.
NAs are accepted and treated as incomparable values.
tspTransform(data.frame(a=1:3,b=1:3,c=rep(2,3)),sep='>=')
#Converting iris data
tspIris<-tspTransform(data.frame(scale(iris[,-5])))
#Feature selection
MIM(tspIris,iris$Species)