| svs-package | R Documentation |
This package offers various tools for semantic vector spaces. There are techniques for correspondence analysis (simple, multiple and discriminant), latent semantic analysis, probabilistic latent semantic analysis, non-negative matrix factorization, latent class analysis, EM clustering, logratio analysis and log-multiplicative (association) analysis. Furthermore, the package has specialized distance measures and plotting functions as well as some helper functions.
This package contains the following raw data files (in the folder extdata):
SndT_Fra.txt: Seventeen Dutch source words and their French translations.
SndT_Eng.txt: Seventeen Dutch source words and their English translations.
InvT_Fra.txt: Seventeen Dutch target words and their French source words.
InvT_Eng.txt: Seventeen Dutch target words and their English source words.
Ctxt_Dut.txt: Context words for seventeen Dutch words.
Ctxt_Fra.txt: Context words for seventeen Dutch words translated from French.
Ctxt_Eng.txt: Context words for seventeen Dutch words translated from English.
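Files in the extdata folder of an installed package can be located with base R's system.file(). The snippet below is a minimal sketch; the read.table() arguments (tab-separated, header row, UTF-8) are assumptions about the file layout and may need adjusting.

```r
# Locate a raw data file shipped with the package. system.file() returns ""
# when the package (or the file) cannot be found.
path <- system.file("extdata", "SndT_Fra.txt", package = "svs")

if (nzchar(path)) {
  # Assumption: a tab-separated text file with a header row.
  SndT_Fra <- read.table(path, header = TRUE, sep = "\t", quote = "\"",
                         encoding = "UTF-8", stringsAsFactors = FALSE)
  str(SndT_Fra)
} else {
  message("Package 'svs' (or the requested file) is not available here.")
}
```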
The (fast procedures for the) techniques in this package are:
fast_sca: Simple correspondence analysis.
fast_mca: Multiple correspondence analysis.
fast_dca: Discriminant correspondence analysis.
fast_lsa: Latent semantic analysis.
fast_psa: Probabilistic latent semantic analysis.
fast_nmf: Non-negative matrix factorization.
fast_lca: Latent class analysis.
fast_E_M: EM clustering.
fast_lra: Logratio analysis.
fast_lma: Log-multiplicative (association) analysis.
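To give an idea of what the first technique in this list computes, here is a base-R sketch of textbook simple correspondence analysis via the singular value decomposition of the standardized residuals. It is an illustration on an invented toy table, not the code behind fast_sca, whose internals and output format may differ.

```r
# Toy contingency table (invented counts, for illustration only).
tab <- matrix(c(10, 2, 3,
                 4, 8, 1,
                 2, 3, 9),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

n <- sum(tab)
P <- tab / n                      # correspondence matrix
row_mass <- rowSums(P)
col_mass <- colSums(P)

# Standardized residuals: (P - r c') / sqrt(r c')
indep <- row_mass %o% col_mass
S <- (P - indep) / sqrt(indep)

dec <- svd(S)

# Principal coordinates of rows and columns.
row_coord <- sweep(dec$u, 1, sqrt(row_mass), "/") %*% diag(dec$d)
col_coord <- sweep(dec$v, 1, sqrt(col_mass), "/") %*% diag(dec$d)
rownames(row_coord) <- rownames(tab)
rownames(col_coord) <- colnames(tab)

# The total inertia equals the Pearson chi-square statistic divided by n.
total_inertia <- sum(dec$d^2)
expected <- n * indep
chi2 <- sum((tab - expected)^2 / expected)

print(round(row_coord[, 1:2], 3))
print(all.equal(total_inertia, chi2 / n))
```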
The complete overview of local and global weighting functions in this package can be found under weighting_functions.
The specialized distance measures are:
dist_chisquare: Chi-square distance.
dist_cosine: Cosine distance.
dist_wrt: Distance with respect to a certain point.
dist_wrt_centers: Distance with respect to cluster centers.
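For orientation, the first two measures can be sketched in base R using their usual textbook definitions: cosine distance as one minus the cosine similarity of two rows, and chi-square distance as the Euclidean distance between row profiles weighted by the inverse column masses. The function names cosine_dist_sketch and chisq_dist_sketch are hypothetical; the package's own dist_cosine and dist_chisquare may differ in details and arguments.

```r
# Hypothetical helper: cosine distance between the rows of a matrix,
# defined here as 1 - cosine similarity.
cosine_dist_sketch <- function(mat) {
  mat <- as.matrix(mat)
  norms <- sqrt(rowSums(mat^2))
  sim <- tcrossprod(mat) / (norms %o% norms)
  as.dist(1 - sim)
}

# Hypothetical helper: chi-square distance between the rows of a
# frequency matrix (row profiles scaled by the square root of column masses).
chisq_dist_sketch <- function(mat) {
  mat <- as.matrix(mat)
  profiles <- mat / rowSums(mat)
  col_mass <- colSums(mat) / sum(mat)
  scaled <- sweep(profiles, 2, sqrt(col_mass), "/")
  dist(scaled)
}

# Toy frequency matrix (invented for illustration).
toy <- rbind(a = c(1, 0), b = c(0, 1), c = c(2, 0))
d_cos <- cosine_dist_sketch(toy)
d_chi <- chisq_dist_sketch(toy)
print(d_cos)
print(d_chi)
```

Rows a and c point in the same direction and have the same profile, so both sketches place them at distance zero from each other.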
The specialized plotting functions are:
cd_plot: Cumulative distribution plot.
pc_plot: Parallel coordinate plot.
There are two helper functions for correspondence analysis:
freq_ca: Compute level frequencies (for a factor).
centers_ca: Compute coordinates for cluster centers.
There is one helper function for pvclust:
complete_pvpick: Complete the output of pvpick.
There is one helper function for igraph:
layout4bipartite: Create a layout matrix for a bipartite graph.
The remaining helper functions in this package are:
rep4dat: Repeat the rows of a data frame according to a frequency column.
vec2ddc: Transform a vector into a double-coded matrix.
dat2ddc: Transform a data frame into a double-coded matrix.
vec2ind: Transform a vector into an indicator matrix.
tab2dat: Transform a table into a data frame.
tab2ind: Transform a table into an indicator matrix.
dat2ind: Transform a data frame into an indicator matrix.
outerec: Recursive application of the outer product.
pmi: Pointwise mutual information.
MI: Mutual information.
log_or_0: Logarithmic transform.
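The two information-theoretic helpers correspond to standard quantities that are easy to state in base R. The sketch below uses the textbook definitions on a contingency table; pmi_sketch and mi_sketch are hypothetical names, and setting the PMI of empty cells to 0 is an assumption made here for convenience, not a statement about how pmi, MI or log_or_0 behave.

```r
# Hypothetical helper: pointwise mutual information for each cell of a
# contingency table, log(p_ij / (p_i. * p_.j)).
pmi_sketch <- function(tab, base = 2) {
  P <- tab / sum(tab)
  expected <- rowSums(P) %o% colSums(P)
  out <- log(P / expected, base = base)
  out[P == 0] <- 0  # assumption: cells with zero probability get PMI 0
  out
}

# Hypothetical helper: mutual information as the probability-weighted
# sum of the pointwise values.
mi_sketch <- function(tab, base = 2) {
  P <- tab / sum(tab)
  sum(P * pmi_sketch(tab, base = base))
}

# Two toy tables: perfectly dependent and perfectly independent.
dependent <- diag(2)
independent <- c(1, 2) %o% c(3, 4)

print(pmi_sketch(dependent))
print(mi_sketch(dependent))
print(mi_sketch(independent))
```

With base 2 the result is expressed in bits: the dependent table carries one bit of mutual information, while the independent table carries none (up to floating-point error).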
Many packages contain correspondence analysis: ca, FactoMineR, MASS and others.
For latent semantic analysis there is also the package lsa.
The package NMF provides more flexibility for non-negative matrix factorization.
For topic models there are the packages lda and topicmodels.
Latent class analysis can also be run in the package poLCA.
For log-ratio analysis there is also the package easyCODA.
The package gnm offers much flexibility for association analysis, i.e. fitting log-multiplicative or Goodman's RC models.
As of 2023, this package is part of Module 10: Multivariate data analysis with R of the Summer School Methods in Language Sciences.
Koen Plevoets, koen.plevoets@ugent.be
This package has benefited greatly from the helpful comments of Lore Vandevoorde, Pauline De Baets and Gert De Sutter. Thanks to Kurt Hornik, Uwe Ligges and Brian Ripley for their valuable recommendations when proofing this package.