dtm_sample: Random samples and permutations from a Document-Term-Matrix

View source: R/nlp_flow.R

dtm_sample    R Documentation

Description

Sample the specified number of rows from the Document-Term-Matrix, either with or without replacement.

Usage

dtm_sample(dtm, size = nrow(dtm), replace = FALSE, prob = NULL)

Arguments

dtm

a document term matrix of class dgCMatrix (which can be an object returned by document_term_matrix)

size

a positive number, the number of rows to sample

replace

logical, should sampling be with replacement?

prob

a vector of probability weights, one for each row of dtm

Value

dtm with as many rows as specified in size
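
The arguments mirror those of base R's sample.int, so the behaviour can be pictured as drawing row indices and subsetting the matrix by them. The sketch below rests on that assumption and is not taken from the package source. sample_rows is a hypothetical helper name, and the small matrix is built by hand with the Matrix package, with made-up counts, so that the snippet runs without udpipe.

```r
library(Matrix)

# Small hand-built dgCMatrix standing in for a document-term matrix
# (3 documents x 3 terms); the counts are made up for illustration.
m <- sparseMatrix(i = c(1, 1, 2, 3), j = c(1, 2, 2, 3), x = c(2, 1, 2, 1),
                  dims = c(3, 3),
                  dimnames = list(c("doc1", "doc2", "doc3"),
                                  c("aa", "bb", "cc")))

# Hypothetical helper (assumption): draw row indices, then subset by them.
sample_rows <- function(m, size = nrow(m), replace = FALSE, prob = NULL) {
  idx <- sample.int(n = nrow(m), size = size, replace = replace, prob = prob)
  m[idx, , drop = FALSE]
}

set.seed(42)                                    # make the draws reproducible
s1   <- sample_rows(m, size = 2)                # 2 distinct rows
s2   <- sample_rows(m, size = 5, replace = TRUE) # 5 rows, repeats allowed
perm <- sample_rows(m)                          # default size: a row permutation
```

Under this reading, leaving size at its default with replace = FALSE returns every row once in shuffled order, which is the "permutation" in the title. Asking for more rows than the matrix has requires replace = TRUE.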

Examples

x <- list(doc1 = c("aa", "bb", "cc", "aa", "b"), 
          doc2 = c("bb", "bb", "dd", ""), 
          doc3 = character(),
          doc4 = c("cc", NA), 
          doc5 = character())
dtm <- document_term_matrix(x)
dtm_sample(dtm, size = 2)
dtm_sample(dtm, size = 3)
dtm_sample(dtm, size = 2)
dtm_sample(dtm, size = 8, replace = TRUE)
dtm_sample(dtm, size = 8, replace = TRUE, prob = c(1, 1, 0.01, 0.5, 0.01))

udpipe documentation built on Jan. 6, 2023, 5:06 p.m.