Provides an interface for cluster analysis of a text corpus. Uses quanteda to assemble text corpora easily. Before clustering, deviationalizes the text vectors using the technique described by Sherin (Sherin, B. [2013]. A computational study of commonsense science: An exploration in the automated analysis of clinical interview data. Journal of the Learning Sciences, 22(4), 600-638. <doi:10.1080/10508406.2013.836654>). Clusters in two stages, with cosine similarity as the measure of similarity between documents: hierarchical agglomerative clustering with Ward's algorithm, followed by k-means clustering. Selects the number of clusters that maximizes the "variance explained" by the clusters, adjusted for the number of clusters. Provides both plotted and printed output of the clustering results. Assesses the "model fit" of the clustering solution against a set of preexisting groups in the dataset.
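The deviationalization and cosine-similarity steps described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not the package's own code. It assumes that "deviationalizing" means removing from each document vector its component along the corpus-average vector, which is one reading of Sherin's deviation vectors. The function names (`column_mean`, `deviationalize`, `cosine_similarity`) and the toy term counts are hypothetical, chosen only for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def column_mean(vectors):
    """Average vector across all documents (one entry per term)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def deviationalize(vectors):
    """Assumed reading of deviation vectors: subtract from each document
    vector its projection onto the unit-length corpus-average vector."""
    mean = column_mean(vectors)
    mean_norm = norm(mean)
    if mean_norm == 0:
        # Nothing to project out if the average vector is all zeros.
        return [list(v) for v in vectors]
    unit_mean = [m / mean_norm for m in mean]
    deviations = []
    for v in vectors:
        scale = dot(v, unit_mean)  # length of v along the average direction
        deviations.append([x - scale * u for x, u in zip(v, unit_mean)])
    return deviations


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    na, nb = norm(a), norm(b)
    if na == 0 or nb == 0:
        return 0.0
    return dot(a, b) / (na * nb)


# Toy term-count vectors for three documents (made-up data).
counts = [[2, 1, 0], [2, 0, 1], [0, 3, 1]]
deviations = deviationalize(counts)

# Pairwise similarities that a clustering step could consume.
similarities = [
    [cosine_similarity(a, b) for b in deviations] for a in deviations
]
```

Under this reading, every deviation vector is orthogonal to the corpus average, so the similarities reflect how documents differ from the "typical" document rather than vocabulary they all share.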
Package details 


Author  Joshua Rosenberg, Alex Lishinski 
Maintainer  Alex Lishinski <alexlishinski@gmail.com> 
License  GPL-3 
Version  0.2.0 
URL  https://github.com/alishinski/clustRcompaR 
Package repository  CRAN 
Installation 
Install the latest version of this package from CRAN by entering `install.packages("clustRcompaR")` in R.
