cleanAbstracts: clean data


View source: R/cleanAbstracts.R

Description

Removes punctuation and, depending on the arguments, removes numbers, converts characters to lower or upper case, removes English stopwords, removes user-specified words, and stems words.

Usage

cleanAbstracts(abstracts, rmNum = TRUE, tolw = TRUE, toup = FALSE,
  rmWords = TRUE, yrWords = NULL, stemDoc = FALSE)

Arguments

abstracts

Output of getAbstracts, or a paragraph of text.

rmNum

Whether to remove numbers from the text.

tolw

Whether to convert characters to lower case.

toup

Whether to convert characters to upper case.

rmWords

Whether to remove a set of English stopwords (e.g., 'the').

yrWords

A character vector listing the words to be removed.

stemDoc

Whether to stem words using Porter's stemming algorithm.
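
To make the effect of the flags concrete, here is a minimal base-R sketch of the kind of cleaning they describe. It is not the package's implementation: the function name clean_sketch, the tiny stopword list, and the order of the steps are all assumptions made for illustration, and stemming (stemDoc) is left out because base R has no stemmer.

```r
# Hypothetical illustration only; NOT the PubMedWordcloud implementation.
clean_sketch <- function(text, rmNum = TRUE, tolw = TRUE, toup = FALSE,
                         rmWords = TRUE, yrWords = NULL) {
  # Tiny demo stopword list (an assumption; real lists are much longer).
  stopwords_demo <- c("a", "an", "and", "of", "the")

  text <- gsub("[[:punct:]]", "", text)              # always strip punctuation
  if (rmNum) text <- gsub("[[:digit:]]", "", text)   # optionally strip digits
  if (tolw) text <- tolower(text)
  if (toup) text <- toupper(text)

  # Split on whitespace and drop empty tokens.
  words <- unlist(strsplit(text, "[[:space:]]+"))
  words <- words[nzchar(words)]

  # Compare case-insensitively so removal works with either case option.
  if (rmWords) words <- words[!(tolower(words) %in% stopwords_demo)]
  if (!is.null(yrWords)) words <- words[!(tolower(words) %in% tolower(yrWords))]

  words
}

text <- "Jobs received a number of honors and public recognition."
clean_sketch(text, yrWords = c("number"))
# "jobs" "received" "honors" "public" "recognition"
```

The sketch returns a plain character vector of the remaining words; the value returned by cleanAbstracts itself is not described on this page and may differ.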

See Also

getAbstracts

Examples

# Abs=getAbstracts(c("22693232", "22564732"))
# cleanAbs=cleanAbstracts(Abs)

# text="Jobs received a number of honors and public recognition."
# cleanD=cleanAbstracts(text)

PubMedWordcloud documentation built on May 1, 2019, 8:02 p.m.