run_firststage_nbchng: First stage function of the article selection process taking...

View source: R/firststage_functions.R

run_firststage_nbchng    R Documentation

First stage function of the article selection process taking in a set of documents and returning a set of potential keywords

Description

Runs the first stage of the article selection process: takes a data frame of documents and returns a set of potential keywords.

Usage

run_firststage_nbchng(
  docs,
  docidvar = "fakeid",
  classvar = "classified",
  typevar = "EV_article",
  textvar = "description",
  stem = TRUE,
  min_termfreq = 20,
  min_docfreq = 20,
  max_termfreq = NULL,
  max_docfreq = NULL,
  remove_punct = TRUE,
  remove_numbers = TRUE,
  remove_hyphens = TRUE,
  termfreq_type = "count",
  docfreq_type = "count",
  dfm_tfidf = FALSE,
  cpoint2 = 0.8,
  cpointchng = 0
)

Arguments

docs

Data frame of documents containing classified cases (R-set) and unclassified cases (S-set)

docidvar

Name of the unique document ID variable; default = "fakeid"

classvar

Indicator identifying classified documents; default = "classified"

typevar

Indicator identifying election violence articles; default = "EV_article"

textvar

Name of the text field to classify on; default = "description"

stem

Whether to stem terms; default = TRUE

min_termfreq

Minimum term frequency; default = 20

min_docfreq

Minimum document frequency; default = 20

max_termfreq

Maximum term frequency; default = NULL

max_docfreq

Maximum document frequency; default = NULL

remove_punct

Whether to remove punctuation; default = TRUE

remove_numbers

Whether to remove numbers; default = TRUE

remove_hyphens

Whether to remove hyphens; default = TRUE

termfreq_type

Type of threshold used for the term frequency limits; default = "count"

docfreq_type

Type of threshold used for the document frequency limits; default = "count"

dfm_tfidf

Whether to apply tf-idf weighting; default = FALSE

cpoint2

Cutpoint on predictability of election violence in step 2; default = 0.8

cpointchng

Cutpoint on change in log predictability between step 1 and step 2; default = 0
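
Examples

The following is a minimal sketch of how the arguments above fit together, not an example taken from the package. The toy data frame, its text, and the 0/1 coding of the classified and EV_article columns are assumptions made for illustration; only the column names and argument defaults come from this page. The call is guarded so that the block runs even when durhamevp is not installed, and it is wrapped in try() because a corpus this small may well be rejected by the function.

```r
# Hypothetical toy corpus. Column names follow the documented defaults;
# the 0/1 coding and the NA values for unclassified rows are assumptions.
docs <- data.frame(
  fakeid      = 1:6,
  classified  = c(1, 1, 1, 1, 0, 0),
  EV_article  = c(1, 0, 1, 0, NA, NA),
  description = c(
    "riot at the polling booth during the election",
    "market prices for corn and wheat this week",
    "voters assaulted by a mob at the hustings",
    "report of the annual flower show",
    "disturbance reported near the polling place",
    "shipping news and arrivals at the port"
  ),
  stringsAsFactors = FALSE
)

# Check that the columns named by docidvar, classvar, typevar and textvar exist.
required <- c("fakeid", "classified", "EV_article", "description")
stopifnot(all(required %in% names(docs)))

# Run the first stage only if the package is available.
# min_termfreq and min_docfreq are lowered from their default of 20
# because the toy corpus has only six short documents.
if (requireNamespace("durhamevp", quietly = TRUE)) {
  keywords <- try(
    durhamevp::run_firststage_nbchng(
      docs,
      docidvar     = "fakeid",
      classvar     = "classified",
      typevar      = "EV_article",
      textvar      = "description",
      min_termfreq = 1,
      min_docfreq  = 1,
      cpoint2      = 0.8,
      cpointchng   = 0
    )
  )
}
```

With a real corpus the default thresholds of 20 for min_termfreq and min_docfreq are likely to be more appropriate than the values used here.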


gidonc/durhamevp documentation built on April 8, 2022, 10:31 a.m.