This function checks, for each word in a text, how frequently it occurs in a given language. This is useful for eliminating rare words to make a text more accessible to an audience with limited vocabulary. htmlParse and xpathSApply from the XML package are used to process HTML files, if necessary. textToWords is a helper function that simply breaks down a character vector into a vector of words.
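To make the word-splitting step concrete, here is a minimal base R sketch of what a helper like textToWords could look like. The function name splitIntoWords, the lower-casing, and the choice of separator pattern are assumptions made for illustration; the package's own implementation may behave differently.

```r
# Hypothetical stand-in for textToWords; the package's real implementation may differ.
splitIntoWords <- function(characterVector) {
  # Lower-case so that "De" and "de" count as the same word (an assumption)
  lowered <- tolower(characterVector)
  # Split on runs of anything that is not a letter or an apostrophe
  pieces <- unlist(strsplit(lowered, "[^[:alpha:]']+"))
  # Drop empty strings, which appear when an element starts with a separator
  pieces[nzchar(pieces)]
}

words <- splitIntoWords(c("Dit is een tekst.", "Om de werking te demonstreren!"))
print(words)
```

A function of this shape, taking a character vector and returning one word per element, is the kind of thing that could be named in the textToWordsFunction argument.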
detectRareWords(textFile = NULL,
                wordFrequencyFile = "Dutch",
                output = c("file", "show", "return"),
                outputFile = NULL,
                wordCol = "Word", freqCol = "FREQlemma",
                textToWordsFunction = "textToWords",
                encoding = "ASCII",
                xPathSelector = "/text()",
                silent = FALSE)

textToWords(characterVector)
textFile | If NULL, a dialog will be shown that enables users to select a file. If not NULL, this has to be either a filename or a character vector. An HTML file can be provided; it is parsed using htmlParse and xpathSApply from the XML package.
wordFrequencyFile | The file with word frequencies to use. If 'Dutch' or 'Polish', files from the Center for Reading Research (http://crr.ugent.be/) are downloaded.
output | How to provide the output, as a character vector containing one or more of "file", "show", and "return".
outputFile | The name of the file to store the output in.
wordCol | The name of the column in the word frequency file that contains the words.
freqCol | The name of the column in the word frequency file that contains the frequencies.
textToWordsFunction | The function to use to split a character vector, where each element contains one or more words, into a vector where each element is a word.
encoding | The encoding used to read and write files.
xPathSelector | If the file provided is an HTML file, the XPath expression used to select the text to process.
silent | Whether to suppress detailed feedback about the process.
characterVector | A character vector, the elements of which are to be broken down into words.
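The role of wordCol and freqCol can be illustrated with a small base R sketch of a frequency lookup. This is not the package's code: the helper lookupFrequencies, the toy table, its made-up counts, and the cutoff of 100 are all hypothetical and serve only to show how a word column and a frequency column might be combined.

```r
# Toy frequency table; column names mirror the defaults wordCol = "Word"
# and freqCol = "FREQlemma". The counts are made up for illustration only.
frequencies <- data.frame(
  Word = c("de", "een", "tekst", "werking"),
  FREQlemma = c(100000, 80000, 500, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of the package
lookupFrequencies <- function(words, freqTable,
                              wordCol = "Word", freqCol = "FREQlemma") {
  # match() returns NA for words that are absent from the table
  idx <- match(tolower(words), tolower(freqTable[[wordCol]]))
  data.frame(word = words,
             frequency = freqTable[[freqCol]][idx],
             stringsAsFactors = FALSE)
}

result <- lookupFrequencies(c("de", "werking", "detectRareWords"), frequencies)
# Flag words that are missing from the table or fall below an arbitrary cutoff
result$rare <- is.na(result$frequency) | result$frequency < 100
print(result)
```

With a custom frequency file, the same idea applies: wordCol and freqCol simply name whichever columns hold the words and their frequencies.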
detectRareWords returns a data frame (invisibly) if output contains "return". Otherwise, NULL is returned (invisibly), but the output is printed and/or written to a file depending on the value of output.
textToWords returns a vector of words.
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters <gjalt-jorn@userfriendlyscience.com>
## Not run:
# Dutch sample text: "This is a text to demonstrate the
# workings of the detectRareWords function."
detectRareWords(paste('Dit is een tekst om de',
                      'werking van de detectRareWords',
                      'functie te demonstreren.'),
                output = 'show')
## End(Not run)