Function that calculates a matrix of Euclidean distances between each pair of instances from two datasets.
getEDmatrix(set1, set2)
set1: a data frame containing only the molecular features used to calculate the Euclidean distance
set2: a data frame containing only the molecular features used to calculate the Euclidean distance
No NA values are accepted: either remove the affected instances beforehand or replace the empty values (e.g., with the respective column median or mean). For the purposes of this package, set1 and set2 should be the same dataset.
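The two options for handling NA values could be sketched as follows. This is only an illustration: `impute_median` is a hypothetical helper that is not part of the package, and the `features` data frame and its column names are made up for the example.

```r
# Hypothetical helper (not part of the package): replace NA values in each
# column with that column's median, computed from the non-missing values.
impute_median <- function(df) {
  for (j in seq_along(df)) {
    missing <- is.na(df[[j]])
    if (any(missing)) {
      df[[j]][missing] <- median(df[[j]], na.rm = TRUE)
    }
  }
  df
}

# Made-up feature table with one missing value per column
features <- data.frame(mw = c(180.2, NA, 46.1), logp = c(1.2, 0.5, NA))

# Option 1: drop every instance (row) that has a missing value
complete_rows <- features[complete.cases(features), ]

# Option 2: fill missing values with the column median
imputed <- impute_median(features)
```

Either result contains no NA values and could then be passed as both set1 and set2.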
All columns present in the data frames are used in the calculation of the Euclidean distances, i.e. the Euclidean distance between set1[i, ] and set2[j, ] for each pair of rows i and j.
Prior to calculating the Euclidean distances, the datasets are scaled using scale, which applies \frac{x_{ij} - \min_j}{\max_j - \min_j} to each instance x_i under column (feature) j.
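As a rough illustration of this min-max formula, the sketch below scales each column to the range [0, 1]. `minmax_scale` is a hypothetical name chosen for this example; the package's own scaling function may differ in detail (for instance, in how it treats a constant column, which here would cause a division by zero).

```r
# Hypothetical helper illustrating column-wise min-max scaling.
minmax_scale <- function(x) {
  x <- as.matrix(x)
  mins <- apply(x, 2, min)  # per-column minimum (min_j)
  maxs <- apply(x, 2, max)  # per-column maximum (max_j)
  # Subtract the column minimum, then divide by the column range
  sweep(sweep(x, 2, mins, "-"), 2, maxs - mins, "/")
}

train <- matrix(1:9, nrow = 3, ncol = 3)
scaled <- minmax_scale(train)
```

With this input every column of `scaled` becomes 0, 0.5, 1, because each column of `train` is an evenly spaced run of three integers.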
A getEDmatrix object, which consists of a set1 vs set2 data frame of Euclidean distances where the values in each row are sorted in ascending order. As a consequence, columns have no meaning on their own. getEDmatrix also implicitly creates two variables, maxs and mins, which are automatically saved under those names and do not need explicit variable assignment. They are created for later data scaling.
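To make the shape of the returned distances concrete, here is a minimal sketch of a pairwise Euclidean distance table with row-wise sorting. It is not the package's implementation: `pairwise_sorted_distances` is a hypothetical name, the input is assumed to be already scaled, and the sketch neither sets a getEDmatrix class nor creates maxs and mins.

```r
# Hypothetical sketch: Euclidean distances between every row of set1 and
# every row of set2, with each row of the result sorted in ascending order.
pairwise_sorted_distances <- function(set1, set2) {
  a <- as.matrix(set1)
  b <- as.matrix(set2)
  d <- matrix(0, nrow = nrow(a), ncol = nrow(b))
  for (i in seq_len(nrow(a))) {
    # Difference between every row of set2 and row i of set1
    diffs <- sweep(b, 2, a[i, ], "-")
    # Sorting discards which set2 instance each distance belongs to
    d[i, ] <- sort(sqrt(rowSums(diffs^2)))
  }
  as.data.frame(d)
}

# Already-scaled example data: rows are (0,0,0), (0.5,0.5,0.5), (1,1,1)
scaled <- matrix(c(0, 0.5, 1), nrow = 3, ncol = 3)
dists <- pairwise_sorted_distances(scaled, scaled)
```

When set1 and set2 are the same dataset, the first column of the result is all zeros, since each instance's smallest distance is to itself.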
train <- matrix(1:9, nrow = 3, ncol = 3)
a <- getEDmatrix(train, train)