sim.wordfish: Simulate data and parameters for a Wordfish model

View source: R/wordfish.R

Description

Simulates data and returns the generating parameter values under Wordfish model assumptions: counts are sampled as independent Poisson draws whose log expected means are linearly related to a lattice of document positions.
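
The assumption above can be written out as follows. The symbols are illustrative notation chosen for this note (they are not taken from the package): d_i is a document effect, c_j a word intercept, s_j a word slope, and theta_i the position of document i.

```latex
% Illustrative notation (an assumption of this note, not the package's):
% d_i = document effect, c_j = word intercept, s_j = word slope.
\begin{align*}
  Y_{ij} &\sim \mathrm{Poisson}(\mu_{ij}), \\
  \log \mu_{ij} &= d_i + c_j + s_j \theta_i .
\end{align*}
% Conditional on the document length N_i = \sum_j Y_{ij}, independent
% Poisson counts are multinomial and the document effect d_i cancels:
\begin{align*}
  (Y_{i1}, \dots, Y_{iV}) \mid N_i &\sim \mathrm{Multinomial}(N_i, \pi_i), \\
  \pi_{ij} &= \frac{\exp(c_j + s_j \theta_i)}{\sum_{k=1}^{V} \exp(c_k + s_k \theta_i)} .
\end{align*}
```

The second pair of equations is why sampling a fixed number of words per document from a multinomial, as described under Details, is consistent with the Poisson formulation.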

Usage

sim.wordfish(
  docs = 10,
  vocab = 20,
  doclen = 500,
  dist = c("spaced", "normal"),
  scaled = TRUE
)

Arguments

docs

How many ‘documents’ should be generated

vocab

How many ‘word’ types should be generated

doclen

A scalar ‘document’ length or vector of lengths

dist

The distribution of ‘document’ positions, either "spaced" or "normal"

scaled

Whether the ‘document’ positions should be rescaled to mean 0 and unit standard deviation

Details

This function either draws ‘docs’ document positions from a Normal distribution or spaces them regularly between 1/‘docs’ and 1.

Half of the word slopes (‘vocab’/2 of them) are 1 and the rest are -1. All word intercepts are 0. For each document, ‘doclen’ words are then sampled from a multinomial with these parameters.

Document positions (theta) are sorted in increasing order across the documents. If ‘scaled’ is TRUE, they are normalized to mean zero and unit standard deviation. This is most helpful when dist = "normal".
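
The steps described above can be sketched in base R. This is a minimal illustration written for this note, not the package's implementation: the function name sim_wordfish_sketch, the element names slope and intercept, the requirement that ‘vocab’ be even, and the words-by-documents orientation of Y are all assumptions.

```r
# Hypothetical helper, NOT the austin implementation: a sketch of the
# simulation steps described in Details.
sim_wordfish_sketch <- function(docs = 10, vocab = 20, doclen = 500,
                                dist = c("spaced", "normal"), scaled = TRUE) {
  dist <- match.arg(dist)
  stopifnot(vocab %% 2 == 0)  # assumption: vocab splits evenly in two

  # Document positions: a regular lattice from 1/docs to 1, or sorted
  # standard Normal draws.
  theta <- if (dist == "spaced") seq_len(docs) / docs else sort(rnorm(docs))
  if (scaled) theta <- (theta - mean(theta)) / sd(theta)

  slope <- rep(c(1, -1), each = vocab / 2)  # half the slopes are 1, the rest -1
  intercept <- rep(0, vocab)                # all word intercepts are 0
  doclen <- rep_len(doclen, docs)           # recycle a scalar document length

  # One multinomial draw per document, with word probabilities proportional
  # to exp(intercept + slope * theta); rmultinom normalizes prob internally.
  Y <- sapply(seq_len(docs), function(i) {
    rmultinom(1, size = doclen[i], prob = exp(intercept + slope * theta[i]))
  })
  dimnames(Y) <- list(words = paste0("W", seq_len(vocab)),
                      docs = paste0("D", seq_len(docs)))

  list(Y = Y, theta = theta, doclen = doclen,
       slope = slope, intercept = intercept)
}

# Example: 5 documents of 100 words each over an 8-word vocabulary.
set.seed(1)
sim <- sim_wordfish_sketch(docs = 5, vocab = 8, doclen = 100)
colSums(sim$Y)  # every column sums to the requested document length
```

Because each document is a single multinomial draw, the column sums of Y equal ‘doclen’ exactly, which a pure Poisson simulation would only match in expectation.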

Value

Y

A sample word-document matrix

theta

The ‘document’ positions

doclen

The ‘document’ lengths

beta

‘Word’ intercepts

psi

‘Word’ slopes

Author(s)

Will Lowe


conjugateprior/austin documentation built on May 11, 2021, 2:46 a.m.