wrap_documents: Wrap tokens into document html strings


View source: R/wrap_documents.r

Description

Pastes the tokens of each document together and wraps each document in an <article> html element.

Usage

wrap_documents(
  tokens,
  meta,
  doc_col = "doc_id",
  token_col = "token",
  space_col = NULL,
  nav = doc_col,
  token_nav = NULL,
  top_nav = NULL,
  thres_nav = NULL,
  drop_missing_meta = FALSE
)

Arguments

tokens

A data.frame with a column for document ids (doc_col) and a column for tokens (token_col)

meta

A data.frame with a column for document ids (doc_col). All other columns are added to the browser as document meta

doc_col

The name of the document id column

token_col

The name of the token column

space_col

Optionally, a column with space indications (e.g., newline) per token (which is how some NLP parsers indicate spaces)

nav

The column in meta used for navigation. Defaults to doc_col (which is 'doc_id' unless changed)

token_nav

Alternative to nav (which uses meta), a column in tokens used for navigation

top_nav

If token_nav is used, navigation filters will only apply to the top x values with the highest token occurrence in a document

thres_nav

Like top_nav, but specifying a threshold for the minimum number of tokens.

drop_missing_meta

If TRUE, omit missing meta rows instead of printing an empty value
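
As a rough illustration of how these arguments fit together, the sketch below builds two tiny data.frames by hand and passes non-default column names. The toy data and the column names (id, word, source) are made up for this example; only the argument names come from the usage above.

```r
library(tokenbrowser)

# Toy token table: one row per token, with a document id column ("id")
# and a token column ("word"). These names are illustrative only.
tokens <- data.frame(
  id   = c("d1", "d1", "d1", "d2", "d2"),
  word = c("Hello", "world", "!", "Second", "document"),
  stringsAsFactors = FALSE
)

# Toy meta table: shares the document id column with the token table;
# every other column (here "source") is passed along as document meta.
meta <- data.frame(
  id     = c("d1", "d2"),
  source = c("example A", "example B"),
  stringsAsFactors = FALSE
)

# Tell wrap_documents which columns hold the ids and the tokens.
# nav is left at its default, so it follows doc_col ("id").
docs <- wrap_documents(tokens, meta, doc_col = "id", token_col = "word")

names(docs)  # document ids
```

Because doc_col is shared by tokens and meta, the same column name has to exist in both data.frames.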

Value

A named vector, with document ids as names and the document html strings as values
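
Since the return value is described as a named vector of html strings, a single document can presumably be picked out by its id and written to disk with base R. A minimal sketch, using the sotu_data example data that ships with the package; the temporary file is only for illustration.

```r
library(tokenbrowser)

docs <- wrap_documents(sotu_data$tokens, sotu_data$meta)

# The vector names are the document ids, so a document can be looked up by id
first_id <- names(docs)[1]
html <- docs[[first_id]]

# Write the html string to a temporary file for inspection
out_file <- tempfile(fileext = ".html")
writeLines(html, out_file)
```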

Examples

docs = wrap_documents(sotu_data$tokens, sotu_data$meta)
head(names(docs))
docs[[1]]

kasperwelbers/tokenbrowser documentation built on May 3, 2021, 8:33 a.m.