MovieGroupProcess: Movie Group Process Function
In jason-hanser/mgp: Movie Group Process

Description

This function implements the Movie Group Process described by Yin and Wang in their 2014 paper, "A Dirichlet Multinomial Mixture Model-based Approach for Short Text Clustering".

Usage

MovieGroupProcess(
  data,
  text,
  K,
  alpha = 0.1,
  beta = 0.1,
  iter = 30,
  repeat_words = FALSE,
  r_stopwords = TRUE
)
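
A call might look like the sketch below. It assumes the package installs under the name `mgp` (taken from the repository name) and exports `MovieGroupProcess`; the data frame `docs` and its `review` column are made up for illustration. The structure of the returned object is not described here, so the example only inspects it with `str()`.

```r
# Hypothetical toy data: one short document per row.
docs <- data.frame(
  id = 1:6,
  review = c(
    "great acting and a clever plot",
    "clever plot with great pacing",
    "terrible sound and poor editing",
    "poor editing ruined the film",
    "the soundtrack was wonderful",
    "wonderful soundtrack and score"
  ),
  stringsAsFactors = FALSE
)

# Guarded so the snippet still runs where mgp is not installed.
if (requireNamespace("mgp", quietly = TRUE)) {
  set.seed(1)  # cluster assignments are sampled, so fix the seed
  clusters <- mgp::MovieGroupProcess(
    data = docs,
    text = review,        # unquoted column name, as the argument list requires
    K = 4,                # upper limit; empty clusters are removed
    alpha = 0.1,
    beta = 0.1,
    iter = 30,
    repeat_words = FALSE,
    r_stopwords = TRUE
  )
  str(clusters)           # inspect whatever the function returns
}
```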

Arguments

`data`	A data frame.
`text`	The name of a column within the data frame containing text to cluster. The column name should not be listed in quotes.
`K`	The upper limit for the number of topics. The function will automatically condense and remove empty clusters.
`alpha`	A tuning parameter ranging from 0 to 1 that controls a document's affinity for a larger cluster. Default is 0.1.
`beta`	A tuning parameter ranging from 0 to 1 that controls a document's affinity for a more similar cluster. Default is 0.1.
`iter`	The upper limit for the number of iterations to perform. The function will terminate earlier if a stable solution is found. Default is set at 30.
`repeat_words`	A logical value indicating whether the documents contain repeated words. If TRUE, the function uses Algorithm 4 from Yin and Wang's paper; if FALSE, it uses Algorithm 3 from their paper. Default is FALSE.
`r_stopwords`	A logical value indicating whether stop words should be removed. Default is TRUE.
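
To make the roles of `alpha` and `beta` concrete, here is a minimal sketch of the cluster-assignment score from Yin and Wang's paper for documents without repeated words (the `repeat_words = FALSE` case). It is not the package's own code: the helper `cluster_log_scores` and all of its argument names are hypothetical, and the counts in the example are invented.

```r
# Hypothetical helper, not part of mgp: unnormalised log-score for placing one
# document in each cluster, assuming no word repeats within the document.
#   doc  : character vector of the document's distinct words
#   m_z  : documents per cluster, with the current document removed
#   n_z  : words per cluster, with the current document removed
#   n_zw : K x V matrix of per-cluster word counts, columns named by word
#   D    : total number of documents, including the current one
cluster_log_scores <- function(doc, m_z, n_z, n_zw, D, alpha = 0.1, beta = 0.1) {
  K <- length(m_z)
  V <- ncol(n_zw)
  N_d <- length(doc)
  # Size term: larger clusters score higher; alpha smooths the counts.
  size_term <- log(m_z + alpha) - log(D - 1 + K * alpha)
  # Similarity term: clusters already containing the document's words score higher.
  word_term <- rowSums(log(n_zw[, doc, drop = FALSE] + beta))
  # Normaliser: one factor per word position in the document.
  norm_term <- vapply(
    n_z,
    function(n) sum(log(n + V * beta + seq_len(N_d) - 1)),
    numeric(1)
  )
  size_term + word_term - norm_term
}

# Invented counts: cluster 1 is about pets, cluster 2 about finance.
vocab <- c("cat", "dog", "fish", "stock", "bond")
n_zw <- matrix(c(3, 2, 1, 0, 0,
                 0, 0, 0, 4, 3),
               nrow = 2, byrow = TRUE, dimnames = list(NULL, vocab))
m_z <- c(3, 3)
n_z <- rowSums(n_zw)

log_s <- cluster_log_scores(c("cat", "dog"), m_z, n_z, n_zw, D = 7)
probs <- exp(log_s - max(log_s))
probs <- probs / sum(probs)  # sampling probabilities over the two clusters
```

Both clusters hold the same number of documents here, so the size term cancels and the word overlap alone pushes the document towards cluster 1. Raising `beta` flattens that preference, because unseen words are penalised less.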