| left_join.corpus | R Documentation |
left_join() adds columns from y to the corpus x, matching documents
based on document variables. This is a mutating join that keeps all documents
from x and adds matching values from y. If a document in x has no match
in y, the new columns will contain NA.
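The keep-everything-from-x behaviour can be illustrated without a corpus at all. The following is a minimal sketch in base R, assuming a plain data frame (`x_vars`, a made-up stand-in for the document variables of x) and using `merge()` rather than the corpus method itself. It only shows the left-join semantics and is not how the method is implemented.

```r
# Stand-in for the document variables of x, one row per document
# (hypothetical data, for illustration only)
x_vars <- data.frame(
  docname = c("doc1", "doc2", "doc3"),
  Year = c(1789, 1793, 1797),
  stringsAsFactors = FALSE
)

# Data to join; note that it has no row for "doc3"
y <- data.frame(
  docname = c("doc1", "doc2"),
  rating = c(5, 4),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of x_vars, which makes this a left join
joined <- merge(x_vars, y, by = "docname", all.x = TRUE)

# The unmatched document is kept, and its new column is NA
joined[joined$docname == "doc3", ]
```

Note that `merge()` sorts rows by the key by default, so it is only a rough analogue of a dplyr-style join when row order matters.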
## S3 method for class 'corpus'
left_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...,
  keep = NULL
)
| x | a quanteda corpus object |
| y | a data frame or tibble to join |
| by | a join specification. See … |
| copy | if … |
| suffix | if there are non-joined duplicate variables in … |
| ... | other arguments passed to … |
| keep | should the join keys from both … |
a corpus with document variables from both x and y
This function provides special handling for joining on document names:
- If by = "docname" (or "docname" appears in the by vector), the function
  will use docnames(x) as the joining column from the corpus, even if
  "docname" is not a document variable.
- If using join_by(docname == other_col), the function will match
  docnames(x) to other_col in y.
- If "docname" exists as an actual document variable in x, that variable
  will be used instead of docnames(x).
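The precedence between the first and third rules above can be sketched as a small base R function. This is only an illustration under stated assumptions: `resolve_docname_key()` is a hypothetical helper, not part of any package, and the corpus is represented by a plain character vector of document names plus a data frame of document variables.

```r
# Hypothetical helper, not part of any package: pick the key values
# that a join on "docname" would use.
resolve_docname_key <- function(doc_names, doc_vars) {
  if ("docname" %in% names(doc_vars)) {
    # A real document variable called "docname" takes precedence
    doc_vars[["docname"]]
  } else {
    # Otherwise fall back to the document names themselves
    doc_names
  }
}

doc_names <- c("1789-Washington", "1793-Washington")

# Case 1: no "docname" variable, so the document names are used
vars_without <- data.frame(Year = c(1789, 1793))
key_without <- resolve_docname_key(doc_names, vars_without)

# Case 2: a "docname" variable exists, so it is used instead
vars_with <- data.frame(
  Year = c(1789, 1793),
  docname = c("a", "b"),
  stringsAsFactors = FALSE
)
key_with <- resolve_docname_key(doc_names, vars_with)
```

In practice this means a stray document variable named "docname" silently changes what the join matches on, so it is worth checking the names of the document variables before joining by "docname".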
# Create example corpus and data
corp <- data_corpus_inaugural[1:5]

# Create data to join with document names
doc_data <- data.frame(
  docname = c("1789-Washington", "1793-Washington", "1797-Adams"),
  century = c(18, 18, 18),
  speech_number = c(1, 2, 1)
)

# Join using docname - matches docnames(corp) to doc_data$docname
left_join(corp, doc_data, by = "docname") %>%
  summary()

# Join using different column names with named vector
doc_data2 <- data.frame(
  doc_id = c("1789-Washington", "1793-Washington"),
  rating = c(5, 4)
)
left_join(corp, doc_data2, by = c("docname" = "doc_id")) %>%
  summary()

# Regular join on existing docvars
year_info <- data.frame(
  Year = c(1789, 1793, 1797, 1801, 1805),
  decade = c("1780s", "1790s", "1790s", "1800s", "1800s")
)
left_join(corp, year_info, by = "Year") %>%
  summary()