findThreshold: Find appropriate threshold range


View source: R/findThreshold.R

Description

This function performs a grid search over candidate clustering thresholds to identify the range of valid values and to show how the level of aggregation varies within that range.

Usage

findThreshold(mod, documents_raw=NULL, documents_matrix=NULL, 
              range_min=.05, range_max=5, step=.05)

Arguments

mod

A fitted STM model object, as returned by the stm function.

documents_raw

The raw documents used to generate the STM model. A character vector where each entry is the full text of a document.

documents_matrix

Document-term matrix representation of the raw documents, as generated by the prepDocuments function.

range_min

Lower bound of the threshold range to be searched. Defaults to 0.05.

range_max

Upper bound of the threshold range to be searched. Defaults to 5.

step

Step size for the grid search, i.e. the spacing between successive threshold values. Defaults to 0.05.
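
The three grid arguments together determine which thresholds are tried. As a rough sketch in base R, assuming the grid is an evenly spaced sequence from range_min to range_max (this page does not show how findThreshold builds the grid internally), the defaults from the Usage section would correspond to:

```r
# Default grid arguments, copied from the Usage section.
range_min <- 0.05
range_max <- 5
step <- 0.05

# Assumption: candidates are evenly spaced between the two bounds.
# findThreshold's actual internals may differ.
candidate_thresholds <- seq(range_min, range_max, by = step)

# Number of candidates, i.e. clustering attempts under this assumption.
n_candidates <- length(candidate_thresholds)
print(n_candidates)
```

Under this assumption, a smaller step or a wider range increases the number of candidates, and with it the number of clustering attempts.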

Value

A data frame containing the following columns:

  1. threshold: Threshold value.

  2. valid: Binary value; 1 if clustering succeeds with the given threshold, 0 if not.

  3. juncture_points: Number of juncture points in the resulting clustering tree; -1 if the run is unsuccessful. Lower threshold values yield more juncture points, corresponding to more binary splits and deeper trees. Higher threshold values yield fewer juncture points, corresponding to trees that are broad rather than deep.
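
One way to read the returned data frame is to keep only the successful runs and then look at the bounds of the valid range and at the number of juncture points. The sketch below uses a small hand-made data frame with the three documented columns. Its values are invented for illustration and are not output from findThreshold. The object names res, ok, valid_range and deepest are likewise hypothetical, as are mod, docs_raw and docs_mat in the commented-out call.

```r
# Hypothetical stand-in for the data frame returned by findThreshold();
# the values are made up purely to illustrate the column layout.
res <- data.frame(
  threshold       = c(0.5, 1.0, 1.5, 2.0, 2.5),
  valid           = c(0, 1, 1, 1, 0),
  juncture_points = c(-1, 6, 4, 2, -1)
)
# With a fitted model and its documents available, this would instead be:
# res <- findThreshold(mod, documents_raw = docs_raw, documents_matrix = docs_mat)

# Keep only thresholds for which clustering succeeded.
ok <- res[res$valid == 1, ]

# Smallest and largest thresholds that produced a valid clustering.
valid_range <- range(ok$threshold)

# Valid threshold with the most juncture points (the deepest tree).
deepest <- ok$threshold[which.max(ok$juncture_points)]

print(valid_range)
print(deepest)
```

Note that range() reports only the outermost valid thresholds. If valid and invalid runs alternate, the rows of ok should be inspected directly. If no run is valid, ok is empty and range() returns infinite bounds with a warning.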

See Also

stmCorrViz


stmCorrViz documentation built on May 1, 2019, 8:03 p.m.