foreign | R Documentation
Read and write CLUTO sparse matrix format files, or the CCS format variant employed by the MC toolkit.
read_stm_CLUTO(file)
write_stm_CLUTO(x, file)
read_stm_MC(file, scalingtype = NULL)
write_stm_MC(x, file)
file: a character string with the name of the file to read or write.
x: a matrix object.
scalingtype: a character string specifying the type of scaling to be used, or NULL (the default).
Documentation for CLUTO including its sparse matrix format used to be available from ‘https://www-users.cse.umn.edu/~karypis/cluto/’.
read_stm_CLUTO reads CLUTO sparse matrices, returning a simple triplet matrix.
write_stm_CLUTO writes CLUTO sparse matrices. Argument x must be coercible to a simple triplet matrix via as.simple_triplet_matrix.
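As an illustration, a CLUTO round trip might look like the sketch below. It assumes that the functions documented here are provided by the slam package (this page does not name the package) and uses a small made-up matrix in which every row has at least one non-zero entry.

```r
library(slam)  # assumption: package providing read_stm_CLUTO / write_stm_CLUTO

# Arbitrary illustration data: a 2 x 3 matrix with some zero entries.
m <- matrix(c(1, 0, 2,
              0, 3, 0), nrow = 2, byrow = TRUE)

f <- tempfile(fileext = ".mat")

# A plain matrix is accepted because it can be coerced via
# as.simple_triplet_matrix().
write_stm_CLUTO(m, f)

# Reading returns a simple triplet matrix.
s <- read_stm_CLUTO(f)

# Converting back to a dense matrix should reproduce the input.
stopifnot(isTRUE(all.equal(as.matrix(s), m, check.attributes = FALSE)))

unlink(f)
```

The temporary file name and its extension are arbitrary choices for this sketch.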
MC is a toolkit for creating vector models from text documents (see https://www.cs.utexas.edu/~dml/software/mc/). It employs a variant of Compressed Column Storage (CCS) sparse matrix format, writing data into several files with suitable names: e.g., a file with ‘_dim’ appended to the base file name stores the matrix dimensions. The non-zero entries are stored in a file the name of which indicates the scaling type used: e.g., ‘_tfx_nz’ indicates scaling by term frequency (‘t’), inverse document frequency (‘f’) and no normalization (‘x’). See ‘README’ in the MC sources for more information.
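The naming scheme described above can be made concrete with a few lines of base R. The helper mc_file_names below is hypothetical and not part of any package; it only assembles the two file names mentioned in the text from a base name and a three-letter scaling code.

```r
# Hypothetical helper, for illustration only: builds the name of the
# dimensions file and of the non-zero entries file for a given base name.
mc_file_names <- function(base, scalingtype = "tfx") {
  c(dim = paste0(base, "_dim"),
    nz  = paste0(base, "_", scalingtype, "_nz"))
}

nm <- mc_file_names("corpus")
nm[["dim"]]  # "corpus_dim"
nm[["nz"]]   # "corpus_tfx_nz"
```

The MC format uses further component files that are not covered by this sketch; see the MC ‘README’ for the full list.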
read_stm_MC reads such sparse matrix information with argument file giving the path with the base file name, and returns a simple triplet matrix.
write_stm_MC writes matrices in MC CCS sparse matrix format. Argument x must be coercible to a simple triplet matrix via as.simple_triplet_matrix.
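A minimal sketch of an MC round trip follows, again assuming the functions come from the slam package. The base name "demo" and the temporary directory are arbitrary. Because this page does not state which scaling code write_stm_MC puts into the name of the non-zero entries file, the example recovers the code from the files actually written instead of hard-coding it.

```r
library(slam)  # assumption: package providing read_stm_MC / write_stm_MC

# Arbitrary illustration data.
m <- matrix(c(1, 0, 2,
              0, 3, 0), nrow = 2, byrow = TRUE)

# Use a fresh directory so that only one set of MC files is present.
d <- tempfile("mc_")
dir.create(d)
base <- file.path(d, "demo")  # base file name; the writer appends suffixes

write_stm_MC(m, base)
files <- list.files(d)        # component files, e.g. the one ending in "_dim"

# Read with scalingtype left at its default (NULL).
s1 <- read_stm_MC(base)

# Read with an explicit scaling type taken from the "_<code>_nz" file name.
nz <- grep("_nz$", files, value = TRUE)
code <- sub(".*_(.*)_nz$", "\\1", nz[1])
s2 <- read_stm_MC(base, scalingtype = code)

stopifnot(isTRUE(all.equal(as.matrix(s1), m, check.attributes = FALSE)),
          isTRUE(all.equal(as.matrix(s2), m, check.attributes = FALSE)))

unlink(d, recursive = TRUE)
```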