mediascores: Estimate the mediascores model

Description

This function is the workhorse of the mediascores library. It fits the model of news media sharing as described in more detail in the library vignette and in the "Model Details" section further below.

Usage

mediascores(
  Y,
  group = NULL,
  anchors,
  user_variance = FALSE,
  variational = FALSE,
  chains = 4,
  cores = getOption("mc.cores", 1L),
  threads = cores,
  iter = 2000,
  warmup = floor(iter/2),
  refresh = 50,
  ...
)

Arguments

Y

matrix or dataframe of dimension n_users x n_domains containing counts of the frequency with which each user (row) shares a given URL domain (column). No missing data are permitted.

group

vector of length n_users indicating group membership of each user. If NULL, every user is assigned to the same group.

anchors

vector of length 2 giving the column positions (indices) of the two anchor domains in Y. The first index fixes the meaning of the lower end of the scale; the second, the upper end. For example, if the first value is the column for the New York Times and the second is the column for FOX News, lower values of \vartheta_i and \zeta_m indicate liberalism and higher values indicate conservatism.

user_variance

logical, whether to include a variance parameter for each user (i.e. whether to include or exclude \omega_i). Setting this argument to TRUE results in a more computationally demanding model. There are typically too few data to identify these additional parameters.

variational

logical, whether variational inference is used for estimation (rstan::vb). If set to FALSE, the standard NUTS sampler in Stan is used (rstan::sampling). Note: variational Bayes is many orders of magnitude faster, but it is recommended that its estimates not be used for final inference unless the size of the data makes sampling infeasible.

chains

integer, the number of Markov chains to run (for variational = FALSE). The default is 4.

cores

integer, the number of cores to use when running chains in parallel.

threads

integer, the total number of threads to use for within-chain parallelization. Defaults to the value of cores.

iter

integer, the total number of iterations per chain, including warmup. Defaults to 2000.

warmup

integer, the number of warmup/burnin iterations per chain. Defaults to floor(iter/2).

refresh

integer, the number of iterations per chain before sampling progress on each chain is displayed to the user.

...

additional arguments passed to rstan::sampling (for variational = FALSE) or rstan::vb (for variational = TRUE)
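
As an illustration of how the Y, group, and anchors arguments described above might be prepared, here is a minimal base-R sketch. The data frames shares and users, their column names, and the domain names are hypothetical and are not part of the package; only the shapes of the resulting objects follow the argument descriptions.

```r
# Hypothetical long-format sharing log: one row per (user, domain) share event.
shares <- data.frame(
  user   = c("u1", "u1", "u2", "u3", "u3", "u3"),
  domain = c("nytimes.com", "foxnews.com", "nytimes.com",
             "foxnews.com", "foxnews.com", "wsj.com"),
  stringsAsFactors = FALSE
)

# Hypothetical user-level metadata with one group label per user.
users <- data.frame(
  user  = c("u1", "u2", "u3"),
  party = c("D", "D", "R"),
  stringsAsFactors = FALSE
)

# Y: n_users x n_domains matrix of counts; table() fills unobserved
# user/domain pairs with 0, so no missing values arise.
Y <- unclass(table(shares$user, shares$domain))
stopifnot(!anyNA(Y))

# group: integer codes, aligned with the row order of Y.
group <- as.integer(factor(users$party[match(rownames(Y), users$user)]))

# anchors: column positions of the lower-end and upper-end anchor domains.
anchors <- match(c("nytimes.com", "foxnews.com"), colnames(Y))
stopifnot(!anyNA(anchors))
```

Looking anchors up by column name rather than hard-coding positions keeps them correct if the column order of Y changes.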

Value

An object of S4 class stanfit (see stanfit-class) representing the fitted results. You can use point_est and rhat to extract point estimates and rhat values for the posterior object.

Model details

The model fit by the mediascores() function is a negative binomial item-response-style model for the count Y_{img} of shares of domain m by user i (a member of group g), of the following form:

Y_{img} \sim NegBin(\pi_{img}, \omega_i\omega_m)

\pi_{img} = \alpha_i + \gamma_m - ||\vartheta_i - \zeta_m||^2,

where \alpha_i denotes a user-level intercept; \gamma_m, a news media domain intercept; \vartheta_i, the sharing-ideology of user i; \zeta_m, the ideology of news media domain m; and \omega_i and \omega_m, user- and domain-level variance parameters. Details regarding the priors for these parameters are discussed in the library's vignette. A group-specific common prior distribution can be placed on users' news-sharing ideology parameters \vartheta_i through the integer-valued group argument (a vector of integers specifying the affiliation of each user).
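
To make the linear predictor concrete, the following base-R sketch draws counts from a model of this shape. It rests on several assumptions that this page does not confirm: that \pi_{img} is the log of the mean, that \omega_i\omega_m acts as the dispersion ("size") of the negative binomial, that ideology is one-dimensional, and that standard normal draws are adequate stand-ins for the priors. The group-specific prior is ignored. It is not the package's own simulate_data() function.

```r
set.seed(1)
n_users   <- 5
n_domains <- 4

# Illustrative parameter draws (not the package's priors).
alpha <- rnorm(n_users)            # user intercepts (alpha_i)
gamma <- rnorm(n_domains)          # domain intercepts (gamma_m)
theta <- rnorm(n_users)            # user sharing-ideology (vartheta_i)
zeta  <- rnorm(n_domains)          # domain ideology (zeta_m)
omega_user   <- rep(1, n_users)    # user-level variance parameters (omega_i)
omega_domain <- rep(2, n_domains)  # domain-level variance parameters (omega_m)

# pi[i, m] = alpha_i + gamma_m - (theta_i - zeta_m)^2
pi_mat <- outer(alpha, gamma, "+") - outer(theta, zeta, "-")^2

# ASSUMPTION: pi is a log-mean and omega_i * omega_m is the dispersion.
size_mat <- outer(omega_user, omega_domain)
Y_sim <- matrix(
  rnbinom(n_users * n_domains, size = size_mat, mu = exp(pi_mat)),
  nrow = n_users, ncol = n_domains
)
```

Because the squared distance enters with a negative sign, the expected count for a user and domain pair falls as their ideology positions move apart, all else equal.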

Examples

## Not run: 
sim_data <- simulate_data(200, 500)
posterior <- mediascores(sim_data$Y, sim_data$group, sim_data$anchors,
                         variational = FALSE, chains = 2)

## End(Not run)

SMAPPNYU/mediascores documentation built on June 28, 2024, 8:17 a.m.