rank_freq_plot: Rank Frequency Plot

View source: R/rank_freq_plot.R

rank_freq_mplot    R Documentation

Rank Frequency Plot

Description

rank_freq_mplot - Plot word rank versus frequency, faceted by grouping variable(s).

rank_freq_plot - Plot word rank versus frequency.

Usage

rank_freq_mplot(
  text.var,
  grouping.var = NULL,
  ncol = 4,
  jitter = 0.2,
  log.freq = TRUE,
  log.rank = TRUE,
  hap.col = "red",
  dis.col = "blue",
  alpha = 1,
  shape = 1,
  title = "Rank-Frequency Plot",
  digits = 2,
  plot = TRUE
)

rank_freq_plot(
  words,
  frequencies,
  plot = TRUE,
  title.ext = NULL,
  jitter.ammount = 0.1,
  log.scale = TRUE,
  hap.col = "red",
  dis.col = "blue"
)

Arguments

text.var

The text variable.

grouping.var

The grouping variables. The default, NULL, generates one word list for all text. Also takes a single grouping variable or a list of one or more grouping variables.

ncol

integer value indicating the number of columns in the facet wrap.

jitter

Amount of horizontal jitter to add to the points.

log.freq

logical. If TRUE plots the frequencies in the natural log scale.

log.rank

logical. If TRUE plots the ranks in the natural log scale.

hap.col

Color of the hapax legomena points (hapax legomena are words that occur only once).

dis.col

Color of the dis legomena points (dis legomena are words that occur exactly twice).

alpha

Transparency level of points (ranges between 0 and 1).

shape

An integer specifying the symbol used to plot the points.

title

Optional plot title.

digits

Integer; number of decimal places to round to.

plot

logical. If TRUE, draws the rank-frequency plot.

words

A vector of words.

frequencies

A vector of frequencies corresponding to the words argument.

title.ext

The title extension that extends: "Rank-Frequency Plot ..."

jitter.ammount

Amount of horizontal jitter to add to the points.

log.scale

logical. If TRUE, plots the rank and frequency on a log scale.
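
Zipf's law (see References) predicts that word frequency falls roughly in proportion to 1/rank, which is why the log options matter: on log-log axes such data lie close to a straight line with slope near -1. The following base R sketch illustrates this with synthetic counts. It does not call qdap, and the variable names (rank, freq, fit) and the constant 1000 are assumptions chosen for illustration.

```r
# Synthetic Zipf-like counts (illustrative data, not from qdap):
# the word at rank r occurs about 1000 / r times.
rank <- 1:50
freq <- round(1000 / rank)

# Fit a line on natural-log axes; Zipf-like data give a slope near -1.
fit <- lm(log(freq) ~ log(rank))
slope <- unname(coef(fit)[2])
print(round(slope, 2))

# Plot with a little horizontal jitter so tied points stay visible.
plot(jitter(log(rank), amount = 0.1), log(freq),
     xlab = "log(rank)", ylab = "log(frequency)",
     main = "Rank-Frequency Plot (synthetic)")
abline(fit, lty = 2)  # dashed reference line from the fit
```

Real text rarely follows the line this closely, so curvature at either end of the plot is expected rather than a sign of an error.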

Value

Returns a rank-frequency plot and a list of three dataframes:

WORD_COUNTS

The word frequencies supplied to rank_freq_plot or created by rank_freq_mplot.

RANK_AND_FREQUENCY_STATS

A dataframe of rank and frequencies for the words used in the text.

LEGOMENA_STATS

A dataframe giving the percentage of hapax legomena and the percentage of dis legomena in the text.
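
As a rough illustration of what these summary tables describe, the base R sketch below ranks a small word-frequency table and computes legomena percentages. It is not qdap's implementation: the helper rank_freq_sketch, its column names, the tie-handling rule, and the choice to express percentages relative to the number of distinct words are all assumptions made for this example.

```r
# Hypothetical helper (not part of qdap): summarise a word-frequency table.
rank_freq_sketch <- function(words, frequencies) {
  stopifnot(length(words) == length(frequencies))
  counts <- data.frame(WORD = as.character(words),
                       FREQ = as.numeric(frequencies),
                       stringsAsFactors = FALSE)

  # Order from most to least frequent; tied words share the smallest rank.
  counts <- counts[order(-counts$FREQ), , drop = FALSE]
  counts$RANK <- rank(-counts$FREQ, ties.method = "min")

  # Assumption: percentages are taken over distinct words (types).
  n_types <- nrow(counts)
  legomena <- data.frame(
    HAPAX = round(100 * sum(counts$FREQ == 1) / n_types, 2),
    DIS   = round(100 * sum(counts$FREQ == 2) / n_types, 2)
  )

  list(counts = counts, legomena = legomena)
}

# Toy input: five distinct words with made-up counts.
out <- rank_freq_sketch(c("the", "cat", "sat", "on", "mat"),
                        c(4, 2, 1, 1, 2))
print(out$counts)
print(out$legomena)
```

In the toy input two of the five words occur once and two occur twice, so both percentages come out at 40.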

Note

rank_freq_mplot uses the ggplot2 package, whereas rank_freq_plot uses base graphics. rank_freq_mplot is more general and flexible, so it should be preferred in most cases.

References

Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge, Massachusetts: Addison-Wesley. p. 1.

Examples

## Not run: 
# rank_freq_mplot EXAMPLES:
x1 <- rank_freq_mplot(DATA$state, DATA$person, ncol = 2, jitter = 0)
ltruncdf(x1, 10)
x2 <- rank_freq_mplot(mraja1spl$dialogue, mraja1spl$person, ncol = 5,
    hap.col = "purple")
ltruncdf(x2, 10)
invisible(rank_freq_mplot(mraja1spl$dialogue, mraja1spl$person, ncol = 5,
    log.freq = FALSE, log.rank = FALSE, jitter = 0.6))
invisible(rank_freq_mplot(raj$dialogue, jitter = 0.5, alpha = 1/15))
invisible(rank_freq_mplot(raj$dialogue, jitter = 0.5, shape = 19, alpha = 1/15))

# rank_freq_plot EXAMPLES:
mod <- with(mraja1spl, word_list(dialogue, person, cut.n = 10,
    cap.list = unique(mraja1spl$person)))
x3 <- rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, title.ext = 'Romeo')
ltruncdf(x3, 10)
ltruncdf(rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, plot = FALSE), 10)
invisible(rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, title.ext = 'Romeo',
    jitter.ammount = 0.15, hap.col = "darkgreen", dis.col = "purple"))
invisible(rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, title.ext = 'Romeo',
    jitter.ammount = 0.5, log.scale = FALSE))
invisible(lapply(seq_along(mod$fwl), function(i){
    dev.new()
    rank_freq_plot(mod$fwl[[i]]$WORD, mod$fwl[[i]]$FREQ, 
        title.ext = names(mod$fwl)[i], jitter.ammount = 0.5, log.scale=FALSE)
}))

## End(Not run)

qdap documentation built on May 31, 2023, 5:20 p.m.