Work in (early) progress. Probably don't even look at it. Consider it pure magic that is not to be tampered with.

Description

In some future release, this might evolve into a function that helps to compare several texts by features like average sentence length, word length, lexical diversity, and so forth. The idea behind it is to conduct a cluster analysis to discover which of several texts are similar to (or very different from) each other. This can be useful, e.g., if you need texts for an experiment which differ in content but are similar regarding surface features like those listed above.

Usage

kRp.cluster(txts, lang, TT.path, TT.preset)

Arguments

txts

A character vector with paths to texts to analyze.

lang

A character string with a valid language identifier.

TT.path

A character string, the path to the TreeTagger installation.

TT.preset

A character string naming the TreeTagger preset to use.

Details

This function is included in the package not really to be used, but perhaps to inspire you to toy around with the code and help me come up with something useful in the end...
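
To make the idea from the description more concrete, here is a minimal sketch in base R. It is not the code behind kRp.cluster and does not use TreeTagger. The sample texts, the helper text_features(), the three features, and the choice of Ward clustering with two groups are all assumptions made purely for illustration.

```r
# Hypothetical mini-corpus; in practice the texts would be read from the
# files named in 'txts'.
texts <- c(
  a = "The cat sat. The dog ran. The bird flew.",
  b = "A cat slept. A dog barked. A bird sang.",
  c = "Notwithstanding considerable methodological heterogeneity, contemporary psycholinguistic investigations consistently demonstrate remarkable convergence."
)

# Hypothetical helper: compute three simple surface features for one text,
# using naive regex-based sentence and word splitting.
text_features <- function(txt) {
  sentences <- unlist(strsplit(txt, "[.!?]+\\s*"))
  sentences <- sentences[nzchar(sentences)]
  words <- unlist(strsplit(tolower(txt), "[^a-z]+"))
  words <- words[nzchar(words)]
  c(
    avg_sentence_length = length(words) / length(sentences),
    avg_word_length     = mean(nchar(words)),
    type_token_ratio    = length(unique(words)) / length(words)
  )
}

# One row per text, one column per feature.
features <- t(sapply(texts, text_features))

# Standardize the features so that none dominates the distances,
# then run a hierarchical cluster analysis and cut the tree into two groups.
clusters <- hclust(dist(scale(features)), method = "ward.D2")
groups <- cutree(clusters, k = 2)
print(groups)
```

With these sample texts, the two texts made of short sentences end up in one group and the long academic sentence in the other. A real implementation would presumably derive its features from tagged text rather than from regular expressions.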