dfm_subset: Extract a subset of a dfm

View source: R/dfm_subset.R

Description

Returns document subsets of a dfm that meet certain conditions, including direct logical operations on docvars (document-level variables). dfm_subset functions identically to subset.data.frame(), using non-standard evaluation to evaluate conditions based on the docvars in the dfm.
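As a rough illustration of the subset.data.frame() behaviour referred to above, the sketch below uses base R only. The data frame dv is a hypothetical stand-in for a dfm's docvars, not an object from this package. It shows the condition being evaluated inside the data frame and a missing value being treated as false.

```r
# Hypothetical stand-in for docvars: one row per document, one NA in grp.
dv <- data.frame(doc = c("d1", "d2", "d3", "d4"),
                 grp = c(1, NA, 2, 3),
                 stringsAsFactors = FALSE)

# 'grp' is looked up inside dv (non-standard evaluation), so dv$grp is not needed.
kept <- subset(dv, grp > 1)

# The d2 row is dropped: its condition evaluates to NA, which counts as FALSE.
kept$doc

# A plain logical vector, one value per row, is also accepted as the condition.
by_vector <- subset(dv, c(TRUE, FALSE, TRUE, FALSE))
by_vector$doc
```

Assuming dfm_subset() follows the same conventions, as the description states, a document whose docvar is missing would be dropped rather than raise an error.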

Usage

dfm_subset(x, subset, ...)

Arguments

x

dfm object to be subsetted

subset

logical expression indicating the documents to keep: missing values are taken as false

...

not used

Details

To select or subset features, see dfm_select() instead.

When subset is a dfm, the returned dfm will be equal in document dimension and order to the dfm used for subsetting. This is the document-level counterpart of using dfm_select() where pattern is a dfm: that function matches features, while dfm_subset() matches documents.

Value

dfm object, with a subset of documents (and docvars) selected according to arguments

See Also

subset.data.frame()

Examples

corp <- corpus(c(d1 = "a b c d", d2 = "a a b e",
                 d3 = "b b c e", d4 = "e e f a b"),
               docvars = data.frame(grp = c(1, 1, 2, 3)))
dfmat <- dfm(corp)
# selecting on a docvars condition
dfm_subset(dfmat, grp > 1)
# selecting on a supplied vector
dfm_subset(dfmat, c(TRUE, FALSE, TRUE, FALSE))

koheiw/quanteda.core documentation built on Sept. 21, 2020, 3:44 p.m.