zipf_plot | R Documentation |
This function creates a log-log plot to visualize Zipf's law, which states that the frequency of an item (classically, a word) is inversely proportional to its rank in the frequency table. The plot compares the observed frequency distribution of elements with the distribution expected if Zipf's law held.
zipf_plot(sequences_long)
sequences_long |
A data frame containing at least one column named 'element', which holds the elements of the sequences. The frequency of each element is used to create the plot. |
- **Observed Frequencies**: Calculated from the provided 'sequences_long' data frame.
- **Expected Frequencies**: Calculated using Zipf's law formula, where the frequency of an element is inversely proportional to its rank.
- **Plotting**: Both observed and expected frequencies are plotted on a log-log scale to compare against Zipf's law.
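The calculation described above can be sketched in base R. This is only an illustration, not the package's actual implementation: the helper name `zipf_table` is hypothetical, and the expected frequency is assumed to be the top-ranked count divided by the rank, which is one common way to scale a Zipf curve. The function's own scaling may differ.

```r
# Hypothetical helper (not part of the package): tabulate observed counts,
# rank them, and attach an assumed Zipf expectation.
zipf_table <- function(sequences_long) {
  stopifnot("element" %in% names(sequences_long))

  # Observed frequencies, most frequent element first
  counts <- sort(table(sequences_long$element), decreasing = TRUE)

  out <- data.frame(
    element = names(counts),
    count = as.integer(counts),
    stringsAsFactors = FALSE
  )

  # Rank 1 is the most frequent element
  out$rank <- seq_len(nrow(out))

  # Assumption: expected frequency = count of the top-ranked element / rank
  out$expected <- out$count[1] / out$rank
  out
}

sequences_long <- data.frame(
  element = c("a", "b", "a", "c", "b", "a", "d", "c", "b", "a")
)
zipf_df <- zipf_table(sequences_long)
print(zipf_df)
```

A data frame of this shape could then be drawn with ggplot2 by mapping `rank` to x and `count` to y, and applying `scale_x_log10()` and `scale_y_log10()` to get the log-log axes.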
A 'ggplot' object that visualizes the observed and expected frequencies of elements according to Zipf's law. The plot includes:
Rank |
The rank of each element based on its frequency, plotted on a log scale. |
Count |
The observed frequency of each element, plotted on a log scale. |
Expected |
The expected frequency of each element if Zipf's law were true, shown as a grey dashed line. |
# Example data frame
sequences_long <- data.frame(element = c('a', 'b', 'a', 'c', 'b', 'a', 'd', 'c', 'b', 'a'))
# Generate the Zipf's law plot
zipf_plot(sequences_long)