plot_wordlist: 2D- or 3D-Plot of a list of words

View source: R/plot_wordlist.r

plot_wordlist    R Documentation

2D- or 3D-Plot of a list of words

Description

2D or 3D-Plot of mutual word similarities to a given list of words

Usage

plot_wordlist(x,connect.lines="all",method="PCA",dims=3,
   axes=F,box=F,cex=1,legend=T, size = c(800,800),
   alpha="graded",alpha.grade=1,col="rainbow",
   tvectors=tvectors,...)

Arguments

x

a character vector of length(x) > 1 that contains the words (or sentences/documents) to be plotted

dims

the dimensionality of the plot; set either dims = 2 or dims = 3

method

the method to be applied; either a Principal Component Analysis (method="PCA") or a Multidimensional Scaling (method="MDS")

connect.lines

(3d plot only) the number of closest associate words that each word is connected to by a line. Setting connect.lines="all" (default) draws all connecting lines and automatically applies alpha="graded".

axes

(3d plot only) whether axes shall be included in the plot

box

(3d plot only) whether a box shall be drawn around the plot

cex

(2d Plot only) A numerical value giving the amount by which plotting text should be magnified relative to the default.

legend

(3d plot only) whether a legend shall be drawn illustrating the color scheme of the connect.lines. The legend is inserted as a background bitmap to the plot using bgplot3d. Therefore, it does not resize very gracefully (see the bgplot3d documentation for more information).

size

(3d plot only) A numeric vector with two elements, the first specifying the width and the second specifying the height of the plot device.

tvectors

the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)

alpha

(3d plot only) A numeric vector specifying the luminance of the connect.lines. By setting alpha="graded", the luminance of every line will be adjusted to the cosine between the two words it connects.

alpha.grade

(3d plot only) Only relevant if alpha="graded". Specify a numeric value for alpha.grade to scale the luminance of all connect.lines up (alpha.grade > 1) or down (alpha.grade < 1) by that factor.

col

(3d plot only) A vector specifying the color of the connect.lines. With col="rainbow" (default), the color of every line is adjusted to the cosine between the two words it connects, according to the rainbow palette. Other available color palettes for this purpose are heat.colors, terrain.colors, topo.colors, and cm.colors (see rainbow). Additionally, you can define a custom color scale by providing more than one color (for example col = c("black","blue","red")).

...

additional arguments which will be passed to plot3d (in a three-dimensional plot only)
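As a rough illustration of the alpha, alpha.grade and col arguments, the following base R sketch maps a few made-up cosine values to line colors and graded alpha values. The variable names, the 100-step color scale and the capping at 1 are assumptions made for this illustration; the package's internal mapping may differ.

```r
# Hypothetical cosines for three connecting lines (made-up values)
cosines <- c(0.15, 0.50, 0.90)

# Custom scale as in col = c("black","blue","red"): interpolate 100 shades
palette_fun <- colorRampPalette(c("black", "blue", "red"))
shades <- palette_fun(100)

# Map each cosine to one of the 100 shades (assumed mapping, lowest index is 1)
idx <- pmax(1, ceiling(cosines * 100))
line_cols <- shades[idx]

# Graded alpha: cosine scaled by alpha.grade, capped at 1 (assumed behavior)
alpha.grade <- 1.5
line_alpha <- pmin(1, cosines * alpha.grade)

print(data.frame(cosine = cosines, col = line_cols, alpha = line_alpha))
```

With alpha.grade > 1 the weaker connections become more visible, while values that would exceed 1 are capped in this sketch.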

Details

Computes all pairwise similarities within a given list of words. On this similarity matrix, a Principal Component Analysis (PCA) or a Multidimensional Scaling (MDS) is applied to get a two- or three-dimensional solution that best captures the similarity structure. This solution is then plotted.
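The procedure described here can be sketched in base R on a small made-up semantic space. This is only an illustration under stated assumptions, not the package's implementation: the toy matrix tv, the use of 1 - cosine as the dissimilarity for MDS, and the use of prcomp for the PCA step are choices made for this example.

```r
# Toy semantic space: every row is a word vector (made-up numbers)
tv <- matrix(c(1.0, 0.2, 0.1, 0.0,
               0.9, 0.3, 0.0, 0.1,
               0.1, 1.0, 0.4, 0.0,
               0.0, 0.8, 0.5, 0.2,
               0.2, 0.1, 0.9, 0.7),
             nrow = 5, byrow = TRUE,
             dimnames = list(c("w1", "w2", "w3", "w4", "w5"), NULL))

# All pairwise cosine similarities between the word vectors
unit <- tv / sqrt(rowSums(tv^2))   # scale every row to unit length
cos_mat <- unit %*% t(unit)

# MDS on the cosine dissimilarities (assumed here to be 1 - cosine)
mds_coords <- cmdscale(as.dist(1 - cos_mat), k = 2)

# PCA on the similarity matrix, keeping the first two component scores
pca_coords <- prcomp(cos_mat)$x[, 1:2]

# 2D plot of the MDS solution, labelled by word
plot(mds_coords, type = "n", xlab = "Dim 1", ylab = "Dim 2")
text(mds_coords, labels = rownames(mds_coords))
```

Setting k = 3 in cmdscale, or keeping three component scores, would give the coordinates for a three-dimensional plot instead.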

To create pretty plots that best show the similarity structure within this list of words, set connect.lines="all" and col="rainbow".
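When connect.lines is given as a number instead of "all", each word is connected only to its closest associates. The selection of those associates from a similarity matrix might look like the following base R sketch; the toy cosine matrix and the variable k (standing in for a numeric connect.lines value) are hypothetical, and the package may select the lines differently.

```r
# Toy cosine matrix for four words (made-up, symmetric, unit diagonal)
words <- c("w1", "w2", "w3", "w4")
cos_mat <- matrix(c(1.0, 0.8, 0.3, 0.1,
                    0.8, 1.0, 0.4, 0.2,
                    0.3, 0.4, 1.0, 0.7,
                    0.1, 0.2, 0.7, 1.0),
                  nrow = 4, dimnames = list(words, words))

k <- 1  # hypothetical stand-in for a numeric connect.lines value

# For each word, pick the k most similar other words
nearest <- lapply(words, function(w) {
  sims <- cos_mat[w, setdiff(words, w)]
  names(sort(sims, decreasing = TRUE))[seq_len(k)]
})
names(nearest) <- words

print(nearest)  # w1 -> w2, w2 -> w1, w3 -> w4, w4 -> w3
```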

Value

see plot3d: this function is called for the side effect of drawing the plot; a vector of object IDs is returned.

plot_wordlist also returns the coordinate vectors of the words in the plot as a data frame.

Author(s)

Fritz Guenther, Taylor Fedechko

References

Landauer, T.K., & Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211-240.

Mardia, K.V., Kent, J.T., & Bibby, J.M. (1979). Multivariate Analysis, London: Academic Press.

See Also

cosine, neighbors, multicos, plot_neighbors, plot3d, princomp, rainbow

Examples

data(wonderland)

## Standard Plot

words <- c("alice","hatter","queen","knight","hare","cheshire") 
            
plot_wordlist(words,tvectors=wonderland,method="MDS",dims=2)


LSAfun documentation built on Nov. 18, 2023, 1:10 a.m.