add_blow: Bind the weighted log odds to a tidy dataset

add_blow    R Documentation

Bind the weighted log odds to a tidy dataset

Description

Calculate the log odds ratio of a tidy dataset, weighted by a Dirichlet prior, and bind it to the dataset itself. The weighted log odds ratio is added as a column named zeta, with optional columns log_odds, variance, odds, and prob. This function supports non-standard evaluation through the tidyeval framework.

Usage

add_blow(
  df,
  group,
  feature,
  n,
  topic = NULL,
  .prior = c("empirical", "uninformative", "tidylo"),
  .compare = c("dataset", "groups"),
  .k_prior = 0.1,
  .alpha_prior = 1,
  .complete = FALSE,
  .log_odds = FALSE,
  .se = FALSE,
  .odds = FALSE,
  .prob = FALSE,
  .sort = FALSE
)

Arguments

df

A tidy dataset with one row per group-feature combination (or per topic-group-feature combination when topic is supplied)

group

Column of groups between which to compare features, such as documents for text data

feature

Column of features for identifying differences, such as words or bigrams with text data

n

Column containing the count of each feature within each group

topic

(Optional) Column of topics within which to compare groups

.prior

Type of prior: "empirical" for a g-prior from empirical Bayes, "uninformative" for a fixed alpha (see .alpha_prior), or "tidylo" for the total frequency counts used in the tidylo implementation

.compare

Whether to compare each group-feature pair against the entire dataset ("dataset") or against all other groups ("groups")

.k_prior

Penalty term for informed prior

.alpha_prior

Frequency of each feature for the uninformative prior

.complete

Whether to complete all topic-group-feature combinations

.log_odds

Whether to include point estimate log odds

.se

Whether to include standard error of estimate

.odds

Whether to include odds of seeing feature within group

.prob

Whether to include probability for feature within group

.sort

Whether to sort by largest zeta

Details

The arguments group, feature, n, and topic are passed by expression and support quasiquotation; you can unquote strings and symbols. Grouping is preserved but ignored.
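
As a rough illustration of the two calling styles described above, the sketch below passes columns first as bare names and then through an unquoted symbol built from a string. The data frame counts, its column names, and the variable feature_col are hypothetical. The call is guarded so that it only runs when funcyfrech and rlang are installed, and no output is shown because it depends on the installed version.

```r
# Toy counts: one row per group-feature pair (hypothetical data).
counts <- data.frame(
  doc  = c("a", "a", "a", "b", "b", "b"),
  word = c("tax", "war", "law", "tax", "war", "law"),
  n    = c(10, 2, 5, 3, 12, 5),
  stringsAsFactors = FALSE
)

# Run only when both packages are available.
if (requireNamespace("funcyfrech", quietly = TRUE) &&
    requireNamespace("rlang", quietly = TRUE)) {
  # Bare column names are captured as expressions.
  by_name <- funcyfrech::add_blow(counts, doc, word, n)

  # A column name held in a string is turned into a symbol and unquoted.
  feature_col <- "word"
  by_symbol <- funcyfrech::add_blow(
    counts, doc, !!rlang::sym(feature_col), n
  )
}
```

The second form is mainly useful when the column name is only known at run time, for example inside a wrapper function.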

The dataset must have exactly one row per topic-group-feature combination for this calculation to succeed. Read Monroe, Colaresi, and Quinn (2008) for more on the weighted log odds ratio.
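
To make the one-row requirement and the statistic itself more concrete, here is a minimal base R sketch of the weighted log odds ratio from Monroe, Colaresi, and Quinn, using a constant pseudo-count alpha for every feature and comparing each group against all other groups. It is an illustration under those assumptions, not the internals of add_blow; the data and the variable names (counts, alpha, alpha_0, y_other, n_other) are hypothetical.

```r
# Hypothetical counts: one row per group-feature pair.
counts <- data.frame(
  group   = c("a", "a", "a", "b", "b", "b"),
  feature = c("tax", "war", "law", "tax", "war", "law"),
  n       = c(10, 2, 5, 3, 12, 5),
  stringsAsFactors = FALSE
)

# The calculation assumes no duplicated group-feature rows.
stopifnot(anyDuplicated(counts[c("group", "feature")]) == 0)

alpha   <- 1                                        # pseudo-count per feature (assumed)
alpha_0 <- alpha * length(unique(counts$feature))   # total prior mass

# Totals per group and per feature, aligned to the rows of `counts`.
n_group   <- ave(counts$n, counts$group, FUN = sum)
n_feature <- ave(counts$n, counts$feature, FUN = sum)

# Counts for the same feature, and for all features, in the other groups.
y_other <- n_feature - counts$n
n_other <- sum(counts$n) - n_group

# Prior-smoothed log odds in the group and in the comparison set.
lo_group <- log((counts$n + alpha) / (n_group + alpha_0 - counts$n - alpha))
lo_other <- log((y_other + alpha) / (n_other + alpha_0 - y_other - alpha))

# Difference in log odds, its approximate variance, and the z-score.
delta    <- lo_group - lo_other
variance <- 1 / (counts$n + alpha) + 1 / (y_other + alpha)
counts$zeta <- delta / sqrt(variance)
```

With only two groups the scores are mirror images: a feature that is over-represented in one group has a zeta of equal magnitude and opposite sign in the other.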

Source

https://doi.org/10.1093/pan/mpn018


scottfrechette/funcyfrech documentation built on Aug. 26, 2022, 9:13 a.m.