read.dsm.ucs | R Documentation |
This function loads raw DSM data – a cooccurrence frequency matrix and tables of marginal frequencies – in UCS export format. The data are read from a directory containing several text files with predefined names, which can optionally be compressed (see ‘File Format’ below for details).
read.dsm.ucs(filename, encoding = getOption("encoding"), verbose = FALSE)
filename |
the name of a directory containing files with the raw DSM data. |
encoding |
character encoding of the input files, which will automatically be converted to R's internal representation if possible. See ‘Encoding’ in |
verbose |
if |
An object of class dsm containing a dense or sparse DSM.
Note that the information tables for target terms (field rows) and feature terms (field cols) include the correct marginal frequencies from the UCS export files. Nonzero counts for rows and columns are added automatically unless they are already present in the disk files. Additional fields from the information tables as well as all global variables are preserved with their original names.
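A minimal usage sketch may help to show how the returned object is typically inspected. The directory path below is a placeholder, and the field names M and globals are assumptions based on the usual layout of dsm objects; only rows and cols are named on this page.

```r
# Hypothetical location of a UCS export directory (placeholder path)
ucs_dir <- "path/to/ucs_export"

# Only attempt to load if the wordspace package and the directory are available
if (requireNamespace("wordspace", quietly = TRUE) && dir.exists(ucs_dir)) {
  model <- wordspace::read.dsm.ucs(ucs_dir, encoding = "UTF-8", verbose = TRUE)

  print(dim(model$M))      # co-occurrence matrix (assumed field name M)
  print(head(model$rows))  # target terms with marginal frequencies
  print(head(model$cols))  # feature terms with marginal frequencies
  print(model$globals)     # global variables (assumed field name), including N
}
```

The guard keeps the snippet harmless on systems where the package or the data directory is missing.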
The UCS export format is a directory containing the following files with the specified names:
‘M’ or ‘M.mtx’
cooccurrence matrix (dense, plain text) or sparse matrix (MatrixMarket format)
‘rows.tbl’
row information (labels term, marginal frequencies f)
‘cols.tbl’
column information (labels term, marginal frequencies f)
‘globals.tbl’
table with a single row containing global variables; must include the variable N specifying the sample size
Each individual file may be compressed with an additional filename extension .gz, .bz2 or .xz; read.dsm.ucs automatically decompresses such files when loading them.
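As a rough illustration of the directory layout described above, the following sketch writes a tiny export directory with base R and reads the files back. The tab-delimited tables with a header row, the row-by-row text layout of the dense matrix, the helper write_tbl, and all terms and counts are assumptions made for this example; they are not taken from the UCS specification.

```r
# Toy co-occurrence data: 2 target terms x 3 feature terms (values invented)
M <- matrix(c(10, 0, 3,
               0, 5, 2), nrow = 2, byrow = TRUE)
rows    <- data.frame(term = c("dog", "cat"), f = c(40, 25))
cols    <- data.frame(term = c("bark", "purr", "tail"), f = c(30, 20, 15))
globals <- data.frame(N = 1000)  # sample size

ucs_dir <- file.path(tempdir(), "toy_ucs")  # hypothetical directory name
dir.create(ucs_dir, showWarnings = FALSE)

# Helper (not part of wordspace): tab-delimited table with a header row
write_tbl <- function(df, dest) {
  write.table(df, dest, sep = "\t", quote = FALSE, row.names = FALSE)
}
write_tbl(rows, file.path(ucs_dir, "rows.tbl"))
write_tbl(cols, gzfile(file.path(ucs_dir, "cols.tbl.gz")))  # compressed variant
write_tbl(globals, file.path(ucs_dir, "globals.tbl"))

# Dense matrix as plain text, one matrix row per line
write.table(M, file.path(ucs_dir, "M"), sep = "\t",
            row.names = FALSE, col.names = FALSE)

# Read everything back with base R to check the round trip
rows_in    <- read.delim(file.path(ucs_dir, "rows.tbl"), quote = "")
cols_in    <- read.delim(gzfile(file.path(ucs_dir, "cols.tbl.gz")), quote = "")
globals_in <- read.delim(file.path(ucs_dir, "globals.tbl"), quote = "")
M_in <- matrix(scan(file.path(ucs_dir, "M"), quiet = TRUE),
               nrow = nrow(rows_in), ncol = nrow(cols_in), byrow = TRUE)
```

The matrix dimensions are recovered from the two information tables, which is why rows.tbl and cols.tbl must list terms in the same order as the matrix rows and columns.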
Stephanie Evert (https://purl.org/stephanie.evert)
The UCS toolkit is a software package for collecting and manipulating co-occurrence data, available from http://www.collocations.de/software.html.
UCS relies on compressed text files as its main storage format. They can be exported as a DSM with ucs-tool export-dsm-matrix.
dsm, read.dsm.triplet