knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Once installed, stylo2gg interfaces with data recorded by the stylo package. The examples below introduce its functionality using the eighty-five Federalist Papers, originally published pseudonymously in 1787 and 1788.

Principal component analysis

As called here, the stylo() function limits words to those found in at least 75% of the texts (using the culling... arguments), saves the data in an object called federalist_mfw, and plots the texts by their word usage with principal component analysis:

# Only run this code chunk interactively, to create the needed files

# library(devtools); load_all()
library(stylo)

federalist_mfw <- 
  stylo(gui = FALSE,
        corpus.dir = system.file("extdata/federalist", package = "stylo2gg"),
        analysis.type = "PCR",
        pca.visual.flavour = "symbols",
        analyzed.features = "w",
        ngram.size = 1,
        display.on.screen = TRUE,
        sampling = "no.sampling",
        culling.max = 75,
        culling.min = 75,
        mfw.min = 900,
        mfw.max = 900,
        write.rds.file = TRUE)

saveRDS(federalist_mfw, "federalist_mfw.rds")

#| label: introstylo1
#| cache: false

library(stylo)

federalist_mfw <- 
  stylo(gui = FALSE,
        corpus.dir = system.file("extdata/federalist", package = "stylo2gg"),
        analysis.type = "PCR",
        pca.visual.flavour = "symbols",
        analyzed.features = "w",
        ngram.size = 1,
        display.on.screen = TRUE,
        sampling = "no.sampling",
        culling.max = 75,
        culling.min = 75,
        mfw.min = 900,
        mfw.max = 900)

# Actually run this code chunk, but don't show the code
federalist_mfw <- readRDS("federalist_mfw.rds")
readRDS("vignettes_PCA_120_MFWs_Culled_75__PCA_.rds")

By default, the stylo2gg() function uses both the data and visualization settings from federalist_mfw:

#| fig-cap: "Using selected `ggplot2` defaults for shapes and colors, the visualization created by `stylo2gg` nevertheless shows the same patterns of style, presenting a figure drawn from the same principal components. Here, the disputed papers are marked by purple diamonds, and they seem closest in style to the papers known to be by Madison, marked by blue Xs."

library(stylo2gg)
federalist_mfw |> 
  stylo2gg()

Other settings are explained in the article on principal component analysis.
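
To make the culling and principal component steps more concrete, here is a minimal base R sketch on a toy frequency table. The matrix values, the text names, and the object names (freqs, share_present, culled, pca) are invented for illustration and are not part of stylo or stylo2gg. The sketch assumes that culling at 75% keeps words appearing in at least 75% of texts and that the "PCR" analysis type corresponds to PCA on scaled frequencies; the real functions handle many more details.

```r
# Toy relative-frequency table: rows are texts, columns are words.
# All values are invented for illustration only.
freqs <- rbind(
  text_a = c(the = 6.1, of = 3.9, upon = 0.4, whilst = 0.0),
  text_b = c(the = 5.8, of = 4.2, upon = 0.5, whilst = 0.0),
  text_c = c(the = 6.4, of = 3.5, upon = 0.0, whilst = 0.2),
  text_d = c(the = 6.0, of = 3.7, upon = 0.1, whilst = 0.0)
)

# Culling (assumed behavior): keep only words that occur in
# at least 75% of the texts.
share_present <- colMeans(freqs > 0)
culled <- freqs[, share_present >= 0.75, drop = FALSE]

# PCA on scaled frequencies, equivalent to using the correlation matrix.
pca <- prcomp(culled, scale. = TRUE)

# Coordinates of each text on the first two principal components.
pca$x[, 1:2]
```

In this toy table, "whilst" appears in only one of four texts, so it is dropped before the components are computed.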

Hierarchical clustering

In addition to plotting two-dimensional relationships with principal components, stylo can draw a dendrogram for cluster analysis, showing texts' relationships based on their distances from each other.

# Only run this code chunk interactively, to create the needed files
federalist_mfw2 <- stylo(gui = FALSE,
      corpus.dir = system.file("extdata/federalist", package = "stylo2gg"),
      custom.graph.title = "Federalist Papers",
      analysis.type = "CA",
      analyzed.features = "w",
      ngram.size = 1,
      display.on.screen = TRUE,
      sampling = "no.sampling",
      culling.max = 75,
      culling.min = 75,
      mfw.min = 900,
      mfw.max = 900,
      write.rds.file = TRUE)

saveRDS(federalist_mfw2, "federalist_mfw2.rds")

#| label: stylo_hc
#| cache: false
federalist_mfw2 <- 
  stylo(gui = FALSE,
      corpus.dir = system.file("extdata/federalist", package = "stylo2gg"),
      custom.graph.title = "Federalist Papers",
      analysis.type = "CA",
      analyzed.features = "w",
      ngram.size = 1,
      display.on.screen = TRUE,
      sampling = "no.sampling",
      culling.max = 75,
      culling.min = 75,
      mfw.min = 900,
      mfw.max = 900)

#| fig.height: 10.0
# Actually run this code chunk, but don't show the code
federalist_mfw2 <- readRDS("federalist_mfw2.rds")
readRDS("vignettes_CA_120_MFWs_Culled_75__Classic Delta_.rds")

This federalist_mfw2 object can then be piped into stylo2gg():

#| fig.height: 10.0
#| fig-cap: "As with principal components analysis, `stylo2gg()` function defaults will recreate the chart made by `stylo()`."
federalist_mfw2 |> 
  stylo2gg()

Alternatively, the federalist_mfw object from the principal component analysis above can produce a similar cluster analysis with the option viz="CA":

#| fig.height: 10.0
#| fig-cap: "Function arguments simplify exploration without necessitating additional calls to `stylo()`."

federalist_mfw |> 
  stylo2gg(viz="CA",
           shapes = FALSE)

Additional settings for visualizing clusters with dendrograms are explained in the article on hierarchical clustering.
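
For readers curious about what sits behind such a dendrogram, the following base R sketch clusters a toy frequency table. It assumes a Classic Delta style distance (Manhattan distance on z-scored frequencies, divided by the number of words) and Ward linkage. Both are assumptions about typical settings rather than a description of the package internals, and the data and object names (freqs, z, delta, tree, groups) are invented for illustration.

```r
# Toy relative-frequency table: rows are texts, columns are words.
# All values are invented for illustration only.
freqs <- rbind(
  text_a = c(the = 6.1, of = 3.9, upon = 0.4),
  text_b = c(the = 6.0, of = 4.0, upon = 0.5),
  text_c = c(the = 6.6, of = 3.3, upon = 0.0),
  text_d = c(the = 6.5, of = 3.4, upon = 0.1)
)

# Standardize each word's frequencies to z-scores across texts.
z <- scale(freqs)

# Delta-style distance (assumed): mean absolute difference of z-scores.
delta <- dist(z, method = "manhattan") / ncol(z)

# Agglomerative clustering with Ward linkage (assumed setting).
tree <- hclust(delta, method = "ward.D")

# Cut the tree into two groups to see which texts cluster together.
groups <- cutree(tree, k = 2)
groups
```

Calling plot(tree) on the result draws the dendrogram in base graphics. In this toy example, text_a and text_b fall into one group and text_c and text_d into the other.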



jmclawson/stylo2gg documentation built on Oct. 24, 2023, 4:54 a.m.