textreg: n-Gram Text Regression, aka Concise Comparative Summarization

Function for sparse regression on raw text, regressing a labeling vector onto a feature space consisting of all possible phrases.
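To make "a feature space consisting of all possible phrases" concrete, here is a minimal sketch in Python (not the package's R/C++ code) that enumerates the contiguous word sequences in a few toy documents and records which documents contain each one. The helper names `phrases` and `appearance_matrix`, the whitespace tokenization, the toy sentences, and the `max_len` cap are all assumptions made for illustration; the package description does not say phrases are capped in length, nor that such a matrix is ever built explicitly.

```python
from typing import List, Set, Tuple


def phrases(text: str, max_len: int = 3) -> Set[str]:
    """Return every contiguous word sequence (n-gram) of up to max_len words."""
    words = text.lower().split()  # naive whitespace tokenization (assumption)
    found = set()
    for n in range(1, max_len + 1):
        for start in range(len(words) - n + 1):
            found.add(" ".join(words[start:start + n]))
    return found


def appearance_matrix(docs: List[str], max_len: int = 3) -> Tuple[List[str], List[List[int]]]:
    """Build a documents-by-phrases 0/1 matrix over all phrases seen in docs."""
    per_doc = [phrases(d, max_len) for d in docs]
    vocab = sorted(set().union(*per_doc))
    matrix = [[1 if p in seen else 0 for p in vocab] for seen in per_doc]
    return vocab, matrix


# Toy documents and a +1/-1 labeling vector (both made up for illustration).
docs = ["worker fell from ladder", "worker slipped in bathtub"]
labels = [-1, 1]
vocab, X = appearance_matrix(docs, max_len=2)
```

A sparse regression would then pick a small subset of the columns of `X` that best predicts `labels`. Even in this toy case the number of columns grows quickly with phrase length, which is why sparsity matters for this kind of model.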

Author
Luke Miratrix
Date of publication
2015-11-11 16:46:48
Maintainer
Luke Miratrix <lmiratrix@stat.harvard.edu>
License
GPL (>= 2)
Version
0.1.3

Man pages

bathtub
Sample of cleaned OSHA accident summaries.
build.corpus
Build a corpus that can be used in the textreg call.
calc.loss
Calculate total loss of model (Squared hinge loss).
clean.text
Clean text and get it ready for textreg.
cluster.phrases
Cluster phrases based on similarity of appearance.
convert.tm.to.character
Convert tm corpus to vector of strings.
cpp_build.corpus
Driver function for the C++ function.
cpp_textreg
Driver function for the C++ function.
dirtyBathtub
Sample of raw-text OSHA accident summaries.
find.CV.C
K-fold cross-validation to determine optimal tuning parameter.
find.threshold.C
Conduct permutation test on labeling to get null distribution...
grab.fragments
Grab all fragments in a corpus with given phrase.
is.fragment.sample
Is object a fragment.sample object?
is.textreg.corpus
Is object a textreg.corpus object?
is.textreg.result
Is object a textreg.result object?
list.table.chart
Graphic showing multiple word lists side-by-side.
make.appearance.matrix
Make phrase appearance matrix from textreg result.
make.count.table
Count number of times documents have a given phrase.
make.CV.chart
Plot K-fold cross-validation curves.
make.list.table
Collate multiple regression runs.
make.path.matrix
Generate matrix describing gradient descent path of textreg.
make.phrase.correlation.chart
Generate visualization of phrase overlap.
make.phrase.matrix
Make a table of where phrases appear in a corpus.
make_search_phrases
Convert phrases to appropriate search string.
make.similarity.matrix
Calculate similarity matrix for set of phrases.
path.matrix.chart
Plot optimization path of textreg.
phrase.count
Count phrase appearance.
phrase.matrix
Make matrix of where phrases appear in corpus.
plot.textreg.result
Plot the sequence of features as they are introduced with the...
predict.textreg.result
Predict labeling with the selected phrases.
print.fragment.sample
Pretty print results of phrase sampling object.
print.textreg.corpus
Pretty print textreg corpus object.
print.textreg.result
Pretty print results of textreg regression.
reformat.textreg.model
Clean up output from textreg.
sample.fragments
Sample fragments of text to contextualize a phrase.
save.corpus.to.files
Save corpus to text (and RData) file.
stem.corpus
Stem corpus with annotation.
testCorpora
Some small, fake test corpora.
textreg
Sparse regression of labeling vector onto all phrases in a...
textreg-package
Sparse regression package for text that allows for multiple...
tm_gregexpr
Call gregexpr on the content of a tm Corpus.

Files in this package

textreg
textreg/inst
textreg/inst/doc
textreg/inst/doc/bathtub_vignette.R
textreg/inst/doc/bathtub_vignette.pdf
textreg/inst/doc/bathtub_vignette.Rnw
textreg/inst/test-all.R
textreg/src
textreg/src/Makevars
textreg/src/textreg.cpp
textreg/src/Makevars.win
textreg/NAMESPACE
textreg/data
textreg/data/testCorpora.RData
textreg/data/bathtub.RData
textreg/data/dirtyBathtub.RData
textreg/R
textreg/R/prediction_code.R
textreg/R/cross_validation_code.R
textreg/R/vizualize_phrases.R
textreg/R/sequenceplotter.R
textreg/R/package_and_data_documentation.R
textreg/R/textreg.R
textreg/R/clean_text.R
textreg/R/stempp.R
textreg/R/text_searching.R
textreg/R/make_word_lists.R
textreg/vignettes
textreg/vignettes/bathtub_vignette.Rnw
textreg/MD5
textreg/build
textreg/build/vignette.rds
textreg/DESCRIPTION
textreg/man
textreg/man/convert.tm.to.character.Rd
textreg/man/save.corpus.to.files.Rd
textreg/man/build.corpus.Rd
textreg/man/textreg.Rd
textreg/man/cluster.phrases.Rd
textreg/man/make.phrase.matrix.Rd
textreg/man/phrase.matrix.Rd
textreg/man/textreg-package.Rd
textreg/man/testCorpora.Rd
textreg/man/print.textreg.corpus.Rd
textreg/man/find.threshold.C.Rd
textreg/man/make.similarity.matrix.Rd
textreg/man/calc.loss.Rd
textreg/man/bathtub.Rd
textreg/man/make.list.table.Rd
textreg/man/find.CV.C.Rd
textreg/man/stem.corpus.Rd
textreg/man/make.count.table.Rd
textreg/man/cpp_textreg.Rd
textreg/man/print.textreg.result.Rd
textreg/man/dirtyBathtub.Rd
textreg/man/make.CV.chart.Rd
textreg/man/print.fragment.sample.Rd
textreg/man/make.path.matrix.Rd
textreg/man/grab.fragments.Rd
textreg/man/is.textreg.corpus.Rd
textreg/man/make.phrase.correlation.chart.Rd
textreg/man/sample.fragments.Rd
textreg/man/path.matrix.chart.Rd
textreg/man/list.table.chart.Rd
textreg/man/is.fragment.sample.Rd
textreg/man/make_search_phrases.Rd
textreg/man/make.appearance.matrix.Rd
textreg/man/reformat.textreg.model.Rd
textreg/man/predict.textreg.result.Rd
textreg/man/tm_gregexpr.Rd
textreg/man/clean.text.Rd
textreg/man/phrase.count.Rd
textreg/man/plot.textreg.result.Rd
textreg/man/is.textreg.result.Rd
textreg/man/cpp_build.corpus.Rd