JSTOR_findassocs: Plot the words with the strongest correlation with a given...
In benmarwick/JSTORr: Simple Text Mining and Document Clustering of JSTOR Journal Articles

Description Usage Arguments Value Examples

Generates a plot of the top-ranking words across all the documents that positively correlate with a given word, grouped into ranges of years. For use with JSTOR's Data for Research datasets (http://dfr.jstor.org/). For best results, repeat the function after adding common words to the stopword list. To learn more about editing the stopword list, see the help for the JSTOR_dtmofnouns function.

JSTOR_findassocs(unpack1grams, nouns, word, n = 5, corlimit = 0.4, plimit = 0.05, topn = 20, biggest = 5, parallel = FALSE)

`unpack1grams`	object returned by the function JSTOR_unpack1grams.
`nouns`	the object returned by the function JSTOR_dtmofnouns. A Document Term Matrix containing the documents.
`word`	The word to calculate the correlations with
`n`	The number of years to aggregate documents by. For example, n = 5 (the default value) will create groups of all documents published in non-overlapping five-year ranges. Note that high n values combined with high plimit and corlimit values will severely filter the output. For exploratory data analysis it's recommended to start with low n values and work up.
`corlimit`	The lower threshold value of the Pearson correlation statistic (default is 0.4).
`plimit`	The threshold value for the p-value of the Pearson correlation (default is 0.05).
`topn`	An integer for the number of top ranking words to plot. For example, topn = 20 (the default value) will plot the top 20 words for each range of years.
`biggest`	An integer to control the maximum size of the text in the plot
`parallel`	Logical. If TRUE, attempts to run the function on multiple cores. Note that this may actually be slower if you have one core, limited memory, or a small data set, because of the overhead of communicating data between the cores.
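
To illustrate how `n`, `corlimit` and `plimit` interact, here is a minimal base R sketch on toy data. It is not the package's implementation: the data frame `docs`, the helper `assoc_in_range`, and the choice to keep words with a correlation at or above `corlimit` and a p-value at or below `plimit` are all assumptions made for illustration.

```r
# Toy data (hypothetical): 15 documents, one per year, with counts of three terms.
docs <- data.frame(
  year    = 1990:2004,
  pirates = c(0, 1, 2, 3, 4,  5, 1, 0, 2, 4,  3, 5, 1, 2, 0),
  ships   = c(1, 2, 3, 4, 5,  6, 2, 1, 3, 5,  4, 6, 2, 3, 1),  # always pirates + 1
  taxes   = c(3, 0, 4, 1, 5,  2, 0, 3, 1, 4,  0, 3, 1, 4, 2)   # only weakly related
)

# Group documents into non-overlapping ranges of n years.
n <- 5
breaks <- seq(min(docs$year), max(docs$year) + n, by = n)
docs$range <- cut(docs$year, breaks = breaks, right = FALSE, dig.lab = 4)

# Hypothetical helper: correlate every other term with the target word inside
# one year range, then keep terms passing both thresholds (assumed directions:
# correlation >= corlimit and p-value <= plimit).
assoc_in_range <- function(d, target, corlimit = 0.4, plimit = 0.05) {
  others <- setdiff(names(d), c("year", "range", target))
  res <- do.call(rbind, lapply(others, function(w) {
    ct <- cor.test(d[[target]], d[[w]], method = "pearson")
    data.frame(word = w, r = unname(ct$estimate), p = ct$p.value)
  }))
  res[res$r >= corlimit & res$p <= plimit, ]
}

# One filtered table of associated words per year range.
by_range <- lapply(split(docs, docs$range), assoc_in_range, target = "pirates")
by_range
```

With only five documents per range, a correlation has to be very strong to reach a small p-value, which is one way that grouping and thresholds can combine to filter the output heavily.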

Returns a plot of the most frequent words per year range, with word size scaled to frequency, and a data frame with words and counts for each year range.

## findassocs <- JSTOR_findassocs(unpack1grams, nouns, "rouges")
## findassocs <- JSTOR_findassocs(unpack1grams, nouns, n = 10, "pirates", topn = 100)
## findassocs <- JSTOR_findassocs(unpack1grams, nouns, n = 5, "marines", corlimit=0.6, plimit=0.001)

benmarwick/JSTORr documentation built on May 12, 2019, 12:59 p.m.
