doStemming: Removes Arabic prefixes and suffixes

View source: R/stemmer.R

doStemming    R Documentation

Removes Arabic prefixes and suffixes

Description

Removes prefixes and suffixes from Arabic words and returns the stemmed text along with a list matching each original word to its stemmed form. By default, different forms of Allah are not stemmed.

Usage

doStemming(texts, dontstem = c('\u0627\u0644\u0644\u0647', '\u0644\u0644\u0647'))

Arguments

texts

The original texts.

dontstem

Words that should not be stemmed. By default, different forms of Allah are left unstemmed.
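
As a minimal sketch of the dontstem argument, the call below extends the default vector with one extra word so that doStemming is asked to leave it unstemmed as well. The extra word (the first word of the example string) and the variable names x, keep, and y are illustrative assumptions, not part of the package.

```r
library(arabicStemR)

## Example string with Arabic characters
x <- '\u0627\u0644\u0644\u063a\u0629 \u0627\u0644\u0639\u0631\u0628\u064a\u0629'

## Default protected words plus one extra word (illustrative choice)
keep <- c('\u0627\u0644\u0644\u0647', '\u0644\u0644\u0647',
          '\u0627\u0644\u0644\u063a\u0629')

## Stem the text, passing the extended vector as dontstem
y <- doStemming(x, dontstem = keep)
y$text
```

Comparing y$text with the result of doStemming(x) shows whether the extra word was affected by stemming under the default setting.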

Value

doStemming returns a named list with the following elements:

text

The stemmed text.

stemmedWords

A list matching each original word to its stemmed form.

Author(s)

Rich Nielsen

Examples

## Load the package
library(arabicStemR)

## Create string with Arabic characters
x <- '\u0627\u0644\u0644\u063a\u0629 \u0627\u0644\u0639\u0631\u0628\u064a\u0629 \u062c\u0645\u064a\u0644\u0629 \u062c\u062f\u0627'

## Remove prefixes and suffixes
y <- doStemming(x)
y$text          # the stemmed text
y$stemmedWords  # list matching words to stemmed words


arabicStemR documentation built on July 18, 2022, 9:06 a.m.