count.words | R Documentation |
Counts how many times each word, or each multi-word phrase, appears in a text file.
count.words(
file,
wordclump = 1,
ignore.case = TRUE,
stopwords = "",
string,
numbers.keep = TRUE,
...
)
file |
Character string giving the filename, with or without a path, of the text file to be analyzed. Words are assumed to be separated by spaces. |
wordclump |
Number of words per clump. For example, wordclump=2 counts how often each 2-word phrase appears. Default is 1 (single words). |
ignore.case |
Logical. Default is TRUE, which means counting is not case-sensitive. |
stopwords |
Optional vector of words to ignore and not count. Default is none. |
string |
A single character string containing text to analyze. Not yet implemented. |
numbers.keep |
Not yet implemented. Intended to control whether numbers are counted; setting it to FALSE would ignore numbers. |
... |
Any other parameters used by |
Returns a data.frame with two columns, term (the word or phrase) and freq (the number of times that term appears in the file), sorted by frequency. The row names are also the terms found.
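To make the shape of the returned value concrete, here is a minimal base R sketch that builds a similar term/freq table from a character string. The helper name count_terms_sketch is hypothetical and is not part of the package. The choices to remove stopwords before forming clumps, to use overlapping clumps, and to sort in ascending order are assumptions, not confirmed behavior of count.words.

```r
# Hypothetical helper for illustration only; NOT the implementation of count.words().
count_terms_sketch <- function(text, wordclump = 1, ignore.case = TRUE,
                               stopwords = character(0)) {
  if (ignore.case) {
    text <- tolower(text)
    stopwords <- tolower(stopwords)
  }
  # Split on runs of whitespace and drop empty strings and stopwords.
  words <- strsplit(text, "[[:space:]]+")[[1]]
  words <- words[nzchar(words) & !(words %in% stopwords)]

  # Number of overlapping clumps of `wordclump` consecutive words (assumption).
  n <- length(words) - wordclump + 1
  if (n < 1) {
    return(data.frame(term = character(0), freq = integer(0),
                      stringsAsFactors = FALSE))
  }
  clumps <- vapply(
    seq_len(n),
    function(i) paste(words[i:(i + wordclump - 1)], collapse = " "),
    character(1)
  )

  # Tabulate the clumps and sort by ascending frequency (assumption).
  tab <- table(clumps)
  out <- data.frame(term = names(tab), freq = as.integer(tab),
                    stringsAsFactors = FALSE)
  out <- out[order(out$freq), , drop = FALSE]
  rownames(out) <- out$term
  out
}

demo <- count_terms_sketch("the cat and the dog and the bird", stopwords = "and")
demo2 <- count_terms_sketch("a b a b a", wordclump = 2)
```

With this sketch, tail(demo, 1) shows the most frequent term, mirroring how the examples use tail() on the result.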
## Not run:
counts <- count.words('speech.txt'); tail(counts, 15)
counts <- count.words('speech.txt', ignore.case=FALSE); head(counts[order(counts$term), ], 15)
counts <- count.words('speech.txt', stopwords=c('The', 'the', 'And', 'and', 'A', 'a'))
tail(counts, 15)
counts <- count.words('speech.txt', 3); tail(counts, 30)
#
counts['the', ]
counts[c('the', 'and', 'notfoundxxxxx'), ] # works only if you are sure all are found
counts[rownames(counts) %in% c('the', 'and', 'notfoundxxxxx'), ]
# that works even if specified word wasn't found
counts[counts$term %in% c('the', 'and', 'notfoundxxxxx'), ]
# that works even if specified word wasn't found
counts <- count.words('C:/mypath/speech.txt')
counts <- count.words('speech.txt', sep='.')
# that counts whole sentences (sort of; it also splits at decimal points)
## End(Not run)
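The lookups in the examples above need a real text file, so here is a self-contained sketch of the same indexing patterns on a toy data.frame. The object toy_counts and its values are made up for illustration and only imitate the documented term/freq layout.

```r
# Toy stand-in for a count.words() result; the values are invented.
toy_counts <- data.frame(term = c("and", "the"), freq = c(2L, 3L),
                         stringsAsFactors = FALSE)
rownames(toy_counts) <- toy_counts$term

wanted <- c("the", "and", "notfoundxxxxx")

# Indexing by row name returns a row of NAs for any term that is missing.
by_name <- toy_counts[wanted, ]

# Filtering with %in% keeps only the terms that were actually found.
by_match <- toy_counts[toy_counts$term %in% wanted, ]
```

Row-name indexing preserves the requested order but pads missing terms with NA, while the %in% filter keeps the table's own order and drops missing terms.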