In alishinski/clustRcompaR: Easy Interface for Clustering a Set of Documents and Exploring Group- Based Patterns

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)

clustRcompaR

The goal of clustRcompaR is to make it easy to cluster (or group) a series of documents (texts of any length), and to interpret these groups and to describe their frequency across factors, such as between different groups or over time.

Installation

You can install the development version of clustRcompaR from GitHub with:

# install.packages("devtools")
devtools::install_github("alishinski/clustRcompaR")

You can install the stable release on CRAN with:

install.packages("clustRcompaR")

Example

This is a basic example using the built-in inaugural addressess dataset.

First, we use cluster() to cluster the documents into three clusters. We include a new variable, year_before_1900, which we will later use to compare frequencies across clusters. Then we use extract_terms() to view the terms and term frequencies in the two clusters.

First, let's process the texts.

library(clustRcompaR)
library(dplyr)

d <- inaugural_addresses
d <- mutate(d, century = ifelse(Year < 1800, "17th",
                                ifelse(Year >= 1800 & Year < 1900, "18th",
                                       ifelse(Year >= 1900 & Year < 2000, "19th", "20th"))))

Next, we cluster the texts.

three_clusters <- cluster(d, century, n_clusters = 3)
extract_terms(three_clusters)

Then, we use the compare() function to compare the frequency of clusters across a factor, in this case, the century. We can then use the compare_plot() or compare_test() (which uses a Chi-Square test) function.

Here, we can compare the texts.

three_clusters_comparison <- compare(three_clusters, "century")
compare_plot(three_clusters_comparison)

alishinski/clustRcompaR documentation built on May 12, 2019, 9:54 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

alishinski/clustRcompaR
Easy Interface for Clustering a Set of Documents and Exploring Group- Based Patterns

In alishinski/clustRcompaR: Easy Interface for Clustering a Set of Documents and Exploring Group- Based Patterns

clustRcompaR

Installation

Example

R Package Documentation

Browse R Packages

We want your feedback!

alishinski/clustRcompaR Easy Interface for Clustering a Set of Documents and Exploring Group- Based Patterns

In alishinski/clustRcompaR: Easy Interface for Clustering a Set of Documents and Exploring Group- Based Patterns

clustRcompaR

Installation

Example

R Package Documentation

Browse R Packages

We want your feedback!

alishinski/clustRcompaR
Easy Interface for Clustering a Set of Documents and Exploring Group- Based Patterns