findMostFreqTerms: Find Most Frequent Terms

View source: R/matrix.R

Description

Find most frequent terms in a document-term or term-document matrix, or a vector of term frequencies.

Usage

findMostFreqTerms(x, n = 6L, ...)
## S3 method for class 'DocumentTermMatrix'
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)
## S3 method for class 'TermDocumentMatrix'
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)

Arguments

x

A DocumentTermMatrix or TermDocumentMatrix, or a vector of term frequencies as obtained by termFreq().

n

A single integer giving the maximal number of terms.

INDEX

An object specifying a grouping of documents for rollup, or NULL (default), in which case each document is considered individually.

...

Arguments to be passed to or from methods.

Details

Only terms with positive frequencies are included in the results.
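
The effect of this rule can be sketched in base R without loading tm. The helper `top_terms()` and the toy frequency vector below are hypothetical and serve only as an illustration; this is not tm's implementation. The sketch shows that asking for n terms may return fewer than n when some of the top-ranked entries have zero frequency.

```r
# Hypothetical helper (not part of tm): keep the n largest entries of a
# named frequency vector, then drop any entry that is not positive.
top_terms <- function(freq, n = 6L) {
    ord <- order(freq, decreasing = TRUE)
    top <- freq[head(ord, n)]
    top[top > 0]
}

# Toy term frequencies; "price" and "barrel" never occur.
freq <- c(oil = 4, opec = 4, crude = 2, price = 0, barrel = 0)

# Four terms are requested, but only three have a positive frequency.
res <- top_terms(freq, n = 4L)
print(res)
```

Under this sketch, n acts as an upper bound on the number of terms returned, which matches the "up to n" wording used in the Value section.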

Value

For the document-term or term-document matrix methods, a list of named frequency vectors, one per document (or per group when INDEX is given), each holding up to n of the most frequent terms. Otherwise, a single such named vector of most frequent terms.
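
The shape of the grouped result can be illustrated with a small dense count matrix in base R. This is a sketch under stated assumptions: the matrix `counts`, the grouping vector `index`, and the per-group selection are made up for illustration, and base `rowsum()` stands in for the rollup that the matrix methods perform on sparse matrices. It is not tm's code.

```r
# Toy document-term counts: 4 documents (rows) by 3 terms (columns).
counts <- matrix(
    c(2, 0, 1,
      1, 3, 0,
      0, 1, 0,
      0, 2, 4),
    nrow = 4L, byrow = TRUE,
    dimnames = list(Docs = c("d1", "d2", "d3", "d4"),
                    Terms = c("oil", "opec", "crude"))
)

# Assumed grouping: documents 1-2 form group 1, documents 3-4 form group 2.
index <- rep(1:2, each = 2L)

# Sum the rows within each group (a dense stand-in for the rollup step).
grouped <- rowsum(counts, group = index)

# For each group, keep up to 2 terms with the largest positive totals.
top_by_group <- lapply(seq_len(nrow(grouped)), function(i) {
    freq <- grouped[i, ]
    top <- freq[head(order(freq, decreasing = TRUE), 2L)]
    top[top > 0]
})
names(top_by_group) <- rownames(grouped)
print(top_by_group)
```

The result is a list with one named vector per group, named after the group labels, which mirrors the `$`1`` and `$`2`` entries in the example output for the INDEX call.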

Examples

data("crude")

## Term frequencies:
tf <- termFreq(crude[[14L]])
findMostFreqTerms(tf)

## Document-term matrices:
dtm <- DocumentTermMatrix(crude)
## Most frequent terms for each document:
findMostFreqTerms(dtm)
## Most frequent terms for the first 10 and the second 10 documents,
## respectively:
findMostFreqTerms(dtm, INDEX = rep(1 : 2, each = 10L))

Example output

Loading required package: NLP

  oil  opec   the  that   was crude 
    4     4     4     3     3     2 
$`127`
   oil    the    its prices  crude    cut 
     5      5      3      3      2      2 

$`144`
 the  oil opec that  and said 
  17   11   10   10    9    9 

$`191`
     the   canada canadian    crude      for      oil 
       4        2        2        2        2        2 

$`194`
  the crude  bbl.  dlrs   for price 
    4     3     2     2     2     2 

$`211`
       the       said        and discounted  estimates        for 
         8          3          2          2          2          2 

$`236`
   the    its kuwait    and    oil    was 
    15      8      8      7      7      7 

$`237`
       the        and     report   economic government     growth 
        30         11          7          6          5          4 

$`242`
  the  were   and   oil  said after 
    6     4     3     3     3     2 

$`246`
       the        and    billion     budget        for government 
        18          9          6          6          6          6 

$`248`
   the    oil prices    and   opec   said 
    27      9      7      6      6      5 

$`273`
  the   mln   bpd  from  last saudi 
   21     9     7     7     7     7 

$`349`
     the      oil      and     arab    crude emirates 
       5        3        2        2        2        2 

$`352`
   the    oil prices  saudi    and accord 
     7      5      4      4      3      2 

$`353`
  oil  opec   the  that   was crude 
    4     4     4     3     3     2 

$`368`
   the  power    oil   ship  after closed 
    11      4      3      3      2      2 

$`489`
        the         and         for         oil       about development 
          8           5           4           4           2           2 

$`502`
  the   and   for   oil  u.s. about 
   13     6     5     4     3     2 

$`543`
    the    1.50    dlrs     for  posted company 
      5       3       3       3       3       2 

$`704`
     the  futures exchange    nymex      and     will 
      21        8        6        6        5        5 

$`708`
 january    1986,     1987  billion    cubic fiscales 
       4        2        2        2        2        2 

$`1`
 the  and  oil said  for  its 
 134   49   46   33   29   28 

$`2`
   the    oil    and    for   said prices 
    95     34     28     21     19     17 

tm documentation built on July 12, 2020, 3 p.m.