cluster: Cluster wrapper function


View source: R/clustRcompaR.R

Description

Cluster wrapper function

Usage

cluster(data, ..., n_clusters, minimum_term_frequency = 3, min_terms = 3,
  num_terms = 10, stopwords = NULL, remove_twitter = FALSE)

Arguments

data

The data frame containing the text vector as the first column

...

Additional columns of the data frame containing metadata for comparison

n_clusters

The number of clusters to be used for the clustering solution

minimum_term_frequency

The minimum number of occurrences for a term to be included

min_terms

The minimum number of terms for a document to be included

num_terms

Number of terms to display in clustering summary output

stopwords

Additional stopwords to exclude from clustering analysis

remove_twitter

Whether to remove text associated with Twitter content; useful when analyzing data from this source (defaults to FALSE)
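The two thresholds minimum_term_frequency and min_terms can be pictured on a toy document-term matrix. The following is a minimal base-R sketch of the idea, not the package's internal code: the matrix is hypothetical, and it assumes that term frequency is counted across the whole corpus and that a document's terms are counted as total occurrences of the retained terms.

```r
# Toy document-term matrix: rows are documents, columns are terms.
# Hypothetical data for illustration only.
dtm <- matrix(
  c(2, 1, 0, 1,
    1, 2, 1, 0,
    0, 0, 0, 1,
    1, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("nation", "people", "war", "tax"))
)

minimum_term_frequency <- 3
min_terms <- 3

# Keep terms whose total count across all documents meets the threshold
# (assumption: frequency is summed over the corpus).
keep_terms <- colSums(dtm) >= minimum_term_frequency
dtm <- dtm[, keep_terms, drop = FALSE]

# Keep documents that still contain enough occurrences of retained terms
# (assumption: occurrences are counted, not distinct terms).
keep_docs <- rowSums(dtm) >= min_terms
dtm <- dtm[keep_docs, , drop = FALSE]

dtm
```

Raising either threshold shrinks the matrix, so very short documents or rare terms may drop out before clustering.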

Details

Performs the clustering half of the process: assembling and cleaning the corpus, deviationalizing, and clustering.
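The deviationalizing and clustering steps can be sketched in base R. This is an illustration under stated assumptions rather than the package's implementation: it assumes that deviationalizing means subtracting the corpus-wide mean term vector from each document vector, and it swaps in stats::kmeans as a stand-in for the clustering step. The counts matrix is hypothetical.

```r
set.seed(1)  # kmeans uses random starting centers

# Hypothetical term-count matrix: 6 documents, 3 terms.
counts <- matrix(
  c(5, 0, 1,
    4, 1, 0,
    6, 0, 0,
    0, 5, 1,
    1, 4, 0,
    0, 6, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("doc", 1:6), c("war", "peace", "tax"))
)

# Assumed form of "deviationalizing": subtract the mean term vector
# of the corpus from every document vector (column-wise centering).
deviations <- sweep(counts, 2, colMeans(counts))

# Cluster the deviation vectors into n_clusters groups
# (kmeans is a stand-in; the package may use a different method).
n_clusters <- 2
fit <- kmeans(deviations, centers = n_clusters, nstart = 10)
fit$cluster
```

Centering on the corpus mean makes each document vector describe how it differs from the average document, which is what the clusters then group on.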

Examples

library(clustRcompaR)
library(dplyr)
library(quanteda)

d <- inaugural_addresses
# Label each address with the century in which it was delivered
d <- mutate(d, century = ifelse(Year < 1800, "18th",
                                ifelse(Year >= 1800 & Year < 1900, "19th",
                                       ifelse(Year >= 1900 & Year < 2000, "20th", "21st"))))

three_clusters <- cluster(d, century, n_clusters = 3)
extract_terms(three_clusters)

three_clusters_comparison <- compare(three_clusters, "century")
compare_plot(three_clusters_comparison)

clustRcompaR documentation built on May 1, 2019, 11:16 p.m.