galbraith: Table of word frequencies (Galbraith, Rowling, Coben,...

Description

This dataset contains a table (matrix) of relative frequencies of the 3000 most frequent words retrieved from 26 books by 5 authors (counting Robert Galbraith separately). It includes the novel "The Cuckoo's Calling", published under the name Robert Galbraith, a pseudonym that turned out to belong to J.K. Rowling. The remaining authors and their books are as follows: Harlan Coben ("Deal Breaker", "Drop Shot", "Fade Away", "One False Move", "Gone for Good", "No Second Chance", "Tell No One"), C.S. Lewis ("The Last Battle", "Prince Caspian: The Return to Narnia", "The Silver Chair", "The Horse and His Boy", "The Lion, the Witch and the Wardrobe", "The Magician's Nephew", "The Voyage of the Dawn Treader"), J.K. Rowling ("The Casual Vacancy", "Harry Potter and the Chamber of Secrets", "Harry Potter and the Goblet of Fire", "Harry Potter and the Deathly Hallows", "Harry Potter and the Order of the Phoenix", "Harry Potter and the Half-Blood Prince", "Harry Potter and the Prisoner of Azkaban", "Harry Potter and the Philosopher's Stone"), and J.R.R. Tolkien ("The Fellowship of the Ring", "The Two Towers", "The Return of the King").

Usage

data("galbraith")

Details

The word frequencies are represented as a two-dimensional table: variables (words) in columns, samples (novels) in rows. The frequencies are relative, i.e. the number of occurrences of a particular word type was divided by the total number of tokens in the given text.
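
The layout and the normalization described above can be illustrated with a small base R sketch. It is not the procedure used to build this dataset: the two toy texts, the naive tokenization on non-letter characters, and the names `texts`, `tokens`, `vocab` and `rel_freq` are assumptions made up for the illustration.

```r
# Toy corpus: two short strings standing in for novels (hypothetical data).
texts <- list(
  text_a = "the cat sat on the mat",
  text_b = "the dog saw the cat and the bird"
)

# Naive tokenization: lowercase, then split on anything that is not a letter.
tokens <- lapply(texts, function(x) strsplit(tolower(x), "[^a-z]+")[[1]])

# Vocabulary (word types) shared by all texts.
vocab <- sort(unique(unlist(tokens)))

# Relative frequency = occurrences of a word type / total tokens in the text.
# vapply() returns one column per text, so transpose to get texts in rows.
rel_freq <- t(vapply(tokens, function(tok) {
  as.numeric(table(factor(tok, levels = vocab))) / length(tok)
}, numeric(length(vocab))))
colnames(rel_freq) <- vocab

# Samples (texts) in rows, variables (words) in columns.
print(dim(rel_freq))
print(round(rel_freq, 3))
```

Because the full toy vocabulary is kept, every row sums to 1. In a table truncated to the most frequent words, row sums would generally be lower.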

Source

The novels represented by this dataset are protected by copyright. For that reason, it was not possible to provide the actual texts. Instead, the frequencies of the most frequent words were extracted, and those can be freely distributed.

Examples

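library(stylo)

# load the frequency table and list the samples (one row per novel)
data(galbraith)
rownames(galbraith)

## Not run: 
# run the analysis on the precomputed frequencies, without the GUI
stylo(frequencies = galbraith, gui = FALSE)

## End(Not run)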

Example output

### stylo version: 0.6.9 ###

If you plan to cite this software (please do!), use the following reference:
    Eder, M., Rybicki, J. and Kestemont, M. (2016). Stylometry with R:
    a package for computational text analysis. R Journal 8(1): 107-121.
    <https://journal.r-project.org/archive/2016/RJ-2016-007/index.html>

To get full BibTeX entry, type: citation("stylo")
Warning message:
no DISPLAY variable so Tk is not available 
 [1] "coben_breaker"        "coben_dropshot"       "coben_fadeaway"      
 [4] "coben_falsemove"      "coben_goneforgood"    "coben_nosecondchance"
 [7] "coben_tellnoone"      "galbraith_cuckoos"    "lewis_battle"        
[10] "lewis_caspian"        "lewis_chair"          "lewis_horse"         
[13] "lewis_lion"           "lewis_nephew"         "lewis_voyage"        
[16] "rowling_casual"       "rowling_chamber"      "rowling_goblet"      
[19] "rowling_hallows"      "rowling_order"        "rowling_prince"      
[22] "rowling_prisoner"     "rowling_stone"        "tolkien_lord1"       
[25] "tolkien_lord2"        "tolkien_lord3"       
using current directory...


culling @ 0	available features (words) 3000
Calculating z-scores... 

Calculating classic Delta distances...
MFW used: 
100 
Processing metadata...


Assigning plot colors according to file names...

 

Function call:
stylo(gui = FALSE, frequencies = galbraith)

Depending on your chosen options, some results should have been written
into a few files; you should be able to find them in your current
(working) directory. Usually, these include a list of words/features
used to build a table of frequencies, the table itself, a file containing
recent configuration, etc.

Advanced users: you can pipe the results to a variable, e.g.:
	 publishable.results = stylo()
this will create a class "publishable.results" containing some presumably
interesting stuff. The class created, you can type, e.g.:
	 summary(publishable.results)
to see which variables are stored there and how to use them.


for suggestions how to cite this software, type: citation("stylo")

stylo documentation built on Dec. 6, 2020, 5:06 p.m.
