Rstem: Interface to Snowball implementation of Porter's word stemming algorithm.

An R interface to the C code that implements Porter's word stemming algorithm, which collapses words to a common root to aid comparison of texts. Stemmers are included for several languages (Danish, Dutch, English, Finnish, French, German, Norwegian, Portuguese, Russian, Spanish, and Swedish). However, these may not be applicable if the words require UTF encoding. The package is extensible: different routines can be specified to create the C routines used in the stemming, which permits debugging, profiling, pool management, caching, etc.

Author
Duncan Temple Lang [aut], Milan Bouchet-Valat [cre]
Date of publication
2013-04-21 11:55:50
Maintainer
Milan Bouchet-Valat <nalimilan@club.fr>
License
BSD
Version
0.4-1

View on R-Forge

Man pages

getStemLanguages
Query the languages supported in this package
wordStem
Get the common root/stem of words
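
The two documented functions can be tried together in a short session. The following is a minimal sketch, not taken from the package's own examples: it assumes Rstem is installed, that wordStem accepts a character vector and a `language` argument, and the sample words are made up for illustration. The guard skips the calls when the package is missing.

```r
# Illustrative input; any character vector of words should work.
words <- c("win", "winning", "winner")

if (requireNamespace("Rstem", quietly = TRUE)) {
  # Query the languages the installed version reports as supported.
  print(Rstem::getStemLanguages())

  # Stem each word. The `language` argument is assumed to take one of
  # the names returned by getStemLanguages().
  stems <- Rstem::wordStem(words, language = "english")

  # Show each word next to its stem.
  print(data.frame(word = words, stem = stems, stringsAsFactors = FALSE))
} else {
  message("Rstem is not installed; skipping the stemming example.")
}
```

Words that share a stem can then be treated as the same term when comparing texts, for example by tabulating `stems` instead of `words`.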

Files in this package

Rstem/DESCRIPTION
Rstem/NAMESPACE
Rstem/R
Rstem/R/langs.R
Rstem/R/stem.S
Rstem/SPlus
Rstem/Todo.html
Rstem/Web
Rstem/Web/index.html
Rstem/inst
Rstem/inst/scripts
Rstem/inst/scripts/README.html
Rstem/inst/scripts/download
Rstem/inst/words
Rstem/inst/words/english
Rstem/inst/words/english/output.txt
Rstem/inst/words/english/stop.txt
Rstem/inst/words/english/voc.txt
Rstem/inst/words/french
Rstem/inst/words/french/output.txt
Rstem/inst/words/french/stop.txt
Rstem/inst/words/french/voc.txt
Rstem/man
Rstem/man/getStemLanguages.Rd
Rstem/man/wordStem.Rd
Rstem/src
Rstem/src/Languages.h
Rstem/src/Makevars
Rstem/src/api.c
Rstem/src/api.h
Rstem/src/danish_stem.c
Rstem/src/danish_stem.h
Rstem/src/dutch_stem.c
Rstem/src/dutch_stem.h
Rstem/src/english_stem.c
Rstem/src/english_stem.h
Rstem/src/finnish_stem.c
Rstem/src/finnish_stem.h
Rstem/src/french_stem.c
Rstem/src/french_stem.h
Rstem/src/german_stem.c
Rstem/src/german_stem.h
Rstem/src/header.h
Rstem/src/mytest.c
Rstem/src/norwegian_stem.c
Rstem/src/norwegian_stem.h
Rstem/src/portuguese_stem.c
Rstem/src/portuguese_stem.h
Rstem/src/russian_stem.c
Rstem/src/russian_stem.h
Rstem/src/spanish_stem.c
Rstem/src/spanish_stem.h
Rstem/src/stem.h
Rstem/src/swedish_stem.c
Rstem/src/swedish_stem.h
Rstem/src/utilities.c
Rstem/vignettes
Rstem/vignettes/stemming.tex