summary_corpus: Summary of Corpus
In avkoehl/textprocessingDSI: Clean an arbitrarily large corpus for topic modelling over many cores


Calculates the word frequency and document frequency of every word in the corpus. The output can then be analyzed to identify and remove outlier words or stop words. Files are processed in parallel across the specified number of cores using `parLapply`, which runs the `summary_file` function on each file in `ipath`.
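
The package source is not reproduced on this page, so the following is only a minimal sketch of the approach described above, assuming newline-delimited documents (`flag = 0`). The helper names `sketch_summary_file` and `sketch_summary_corpus`, the whitespace tokenization, and the lowercasing are hypothetical and are not the package's actual implementation; `makeCluster`, `parLapply`, and `stopCluster` come from the base `parallel` package.

```r
library(parallel)

# Hypothetical stand-in for summary_file: treats each line as one document.
sketch_summary_file <- function(path) {
  docs <- readLines(path, warn = FALSE)
  # Naive tokenization (an assumption): lowercase, split on whitespace.
  tokens <- lapply(strsplit(tolower(docs), "\\s+"), function(x) x[nzchar(x)])
  all_terms <- unlist(tokens)
  if (length(all_terms) == 0) {
    return(data.frame(term = character(0), freq = integer(0),
                      doccount = integer(0), stringsAsFactors = FALSE))
  }
  freq <- table(all_terms)                           # total occurrences per term
  doccount <- table(unlist(lapply(tokens, unique)))  # documents containing each term
  terms <- names(freq)
  data.frame(term = terms,
             freq = as.integer(freq[terms]),
             doccount = as.integer(doccount[terms]),
             stringsAsFactors = FALSE)
}

# Hypothetical stand-in for summary_corpus: one task per file, then merge.
sketch_summary_corpus <- function(ipath, ncores) {
  files <- list.files(ipath, full.names = TRUE)
  cl <- makeCluster(ncores)
  on.exit(stopCluster(cl))
  per_file <- parLapply(cl, files, sketch_summary_file)
  merged <- do.call(rbind, per_file)
  if (is.null(merged) || nrow(merged) == 0) return(merged)
  # Sum the counts of terms that occur in more than one file.
  aggregate(cbind(freq, doccount) ~ term, data = merged, FUN = sum)
}
```

Summing `doccount` across files is valid in this sketch because each document belongs to exactly one file, so no document is counted twice.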

summary_corpus(ipath, ncores, flag = 0)

`ipath`	A string specifying the path to the input files.
`ncores`	A number specifying the number of cores to use.
`flag`	Optional. A number specifying whether documents are delimited by newlines (set to 0) or each document is in a separate text file.

A data frame that merges the per-file data frames. It contains the columns term, freq, and doccount, with one row per term.
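
A data frame with these three columns can be used to pick out candidate stop words and outlier terms. The following is a minimal sketch on a small hand-built data frame; the values, the thresholds, and the variable names (`corpus_summary`, `n_docs`, `too_common`, `too_rare`, `kept`) are illustrative assumptions, not package defaults.

```r
# Toy data frame with the documented columns (values are made up).
corpus_summary <- data.frame(
  term = c("the", "topic", "model", "zyxw"),
  freq = c(950L, 40L, 35L, 1L),
  doccount = c(100L, 30L, 25L, 1L),
  stringsAsFactors = FALSE
)
n_docs <- 100L  # assumed total number of documents in the corpus

# Share of documents that contain each term.
doc_share <- corpus_summary$doccount / n_docs

# Candidate stop words: terms present in almost every document.
too_common <- corpus_summary$term[doc_share > 0.9]
# Candidate outliers: terms that occur in at most one document.
too_rare <- corpus_summary$term[corpus_summary$doccount <= 1L]

# Keep everything that is neither too common nor too rare.
kept <- corpus_summary[!corpus_summary$term %in% c(too_common, too_rare), ]
```

Suitable thresholds depend on the corpus, so it is worth inspecting the distribution of `doccount` before choosing them.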

## Not run: 
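# ncores = 2 is an illustrative value; flag = 0 treats each line as one document
summary_corpus("/path/to/corpus/", ncores = 2, flag = 0)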

## End(Not run)

avkoehl/textprocessingDSI documentation built on June 5, 2019, 7:41 p.m.
