Distance matrix

Description

The distancematrix function reformats an input distance matrix into the format required by the nonbipartite matching Fortran code. The original matrix should have dimensions NxN, where N is the total number of elements to be matched. The matrix may be created in R and passed to the distancematrix function. Alternatively, it may be read from a CSV file, i.e. a text file in which the distances in each row are delimited by commas. If a list is given, it should contain a data.frame element named "dist", preferably generated by the gendistance function.
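
As a minimal sketch of the first route, an NxN matrix can be built in base R with dist and checked before it is handed to distancematrix. The variable names and the sanity checks are illustrative assumptions, not part of the package, and the final call runs only if nbpMatching is installed.

```r
# Illustrative sketch: build an N x N distance matrix in base R.
set.seed(1)                        # fixed seed so the sketch is reproducible
n <- 8                             # number of elements to match (hypothetical)
values <- sample(1:25, n, replace = TRUE)
plain <- as.matrix(dist(values))   # pairwise absolute differences, N x N

# Sanity checks on the input (assumed good practice for this sketch)
is_square    <- nrow(plain) == ncol(plain)
is_symmetric <- isSymmetric(plain)
zero_diag    <- all(diag(plain) == 0)

# Reformat for the matching code only when the package is available
if (requireNamespace("nbpMatching", quietly = TRUE)) {
  mdm <- nbpMatching::distancematrix(plain)
}
```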

Usage

distancematrix(x, ...)

Arguments

x

A matrix, data.frame, list or filename. This should be an NxN distance matrix for the N elements to be matched. Values on the diagonal are ignored because an element cannot be matched to itself. Zeros on the diagonal are preferable, although other values are acceptable provided they are not so large that they distort the scaling of the other values.

...

Additional arguments, passed to read.csv when x is a filename.
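
The CSV route can be sketched in base R as follows. The IDs, the temporary file and the choice of stringsAsFactors are hypothetical; the sketch only shows what read.csv returns for such a file, and the distancematrix call is left as a comment because it needs the nbpMatching package.

```r
# Illustrative sketch: store a distance matrix as CSV and inspect what
# read.csv returns for it.
set.seed(2)
ids <- LETTERS[1:6]                       # hypothetical element IDs
m <- as.matrix(dist(rnorm(6)))
dimnames(m) <- list(ids, ids)

csv_path <- tempfile(fileext = ".csv")    # temporary file for the sketch
write.csv(m, csv_path)                    # row names become the first column

# stringsAsFactors is an ordinary read.csv argument, used here as an example
from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)
dim(from_csv)                             # 6 rows, 7 columns: IDs plus 6 distances

# Per the description above, the file name could be given to distancematrix
# directly, with any extra arguments going to read.csv:
# mdm <- nbpMatching::distancematrix(csv_path)
```

Writing the row names produces an Nx(N+1) table, which is one of the shapes with an extra ID column described under Details.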

Details

  • If an extra row or column is present, it will be converted into row names. In other words, if the matrix has dimensions (N+1)xN or Nx(N+1), the function will treat the first row or the first column, respectively, as IDs. If both row and column names are present, i.e. an (N+1)x(N+1) matrix, the function cannot identify the names.

  • If an odd number of elements exist, a ghost element, or sink, will be created whose distance is zero to all of the other elements. For example, when matching 17 elements, the function will create an 18th element that matches every element perfectly. This sink may or may not be appropriate for your application. Naturally, you may create sinks as needed in the distance matrix you input to the distancematrix function.

  • The elements of a distancematrix object may not be re-assigned once it is created. In other words, you cannot edit the formatted distance matrix. Instead, edit the matrix being input into the distancematrix function.
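
The ID column and sink adjustments can be mimicked by hand in base R, for instance when you want to control the sink yourself before calling distancematrix. This is a sketch under stated assumptions: the label "ghost" and all variable names are hypothetical, and the package's own handling of IDs and sinks may differ in detail.

```r
# Illustrative sketch of the two adjustments described above, in base R only.
set.seed(3)
n <- 5                                    # odd number of elements
d <- as.matrix(dist(rnorm(n)))

# (a) An N x (N+1) data frame whose first column holds IDs
with_ids  <- data.frame(id = LETTERS[1:n], d)
ids       <- with_ids[[1]]                # first column taken as IDs
dist_only <- as.matrix(with_ids[, -1])    # remaining N x N distances
dimnames(dist_only) <- list(ids, ids)

# (b) Hand-made sink: one extra element at distance zero from all others
sink_name <- "ghost"                      # hypothetical label
padded <- rbind(cbind(dist_only, 0), 0)
dimnames(padded) <- list(c(ids, sink_name), c(ids, sink_name))
dim(padded)                               # 6 x 6, an even number of elements
```

Because the formatted object cannot be edited afterwards, changes of this kind belong in the input matrix.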

Value

distancematrix S4 object

Author(s)

Cole Beck

See Also

nonbimatch, gendistance

Examples

library(nbpMatching)

# Distance matrix built directly in R from 8 sampled values
plainmatrix <- as.matrix(dist(sample(1:25, 8, replace=TRUE)))
diag(plainmatrix) <- 99999  # setting the diagonal to a very large distance
                            # for pedagogical reasons (the diagonal may be
                            # left as zero)
mdm <- distancematrix(plainmatrix)

# Distance matrix generated from a data.frame with some missing values
df <- data.frame(id=LETTERS[1:25], val1=rnorm(25), val2=rnorm(25))
df[sample(seq_len(nrow(df)), ceiling(nrow(df)*0.1)), 2] <- NA
df.dist <- gendistance(df, idcol=1, ndiscard=2)
mdm2 <- distancematrix(df.dist)  # list input with a "dist" element