knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  progress = FALSE,
  error = FALSE, 
  message = FALSE,
  warning = FALSE
)

options(digits = 2)

A fast version of the Rapid Automatic Keyword Extraction (RAKE) algorithm


Installation

You can get the stable version on CRAN:

install.packages("rapidraker")

The development version of the package requires you to compile the latest Java source code in rapidrake-java, so installing it is not as simple as making a call to devtools::install_github().

What is rapidraker?

rapidraker is an R package that provides an implementation of the same keyword extraction algorithm (RAKE) as slowraker. However, rapidraker::rapidrake() is written in Java, whereas slowraker::slowrake() is written in R. This means that you can expect rapidrake() to be considerably faster than slowrake().
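One way to check the speed difference on your own data is to time both functions on the same input. The snippet below is a minimal sketch, not a benchmark from the package authors: it assumes both packages are installed and uses the dog_pubs data set that ships with slowraker. The variable names (txt, slow_out, rapid_out, time_slow, time_rapid) are illustrative only.

```r
library(slowraker)
library(rapidraker)

data("dog_pubs")
txt <- dog_pubs$abstract[1:5]

# Time each implementation on the same five abstracts
time_slow <- system.time(slow_out <- slowrake(txt = txt))
time_rapid <- system.time(rapid_out <- rapidrake(txt = txt))

# Elapsed wall-clock seconds for each call
time_slow[["elapsed"]]
time_rapid[["elapsed"]]
```

With only a few short documents, fixed overhead may mask the difference, so a larger slice of dog_pubs$abstract is likely to give a more representative comparison.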

Usage

rapidrake() has the same arguments as slowrake(), and both functions output the same type of object. You can therefore substitute rapidrake() for slowrake() without making any additional changes to your code.

library(slowraker)
library(rapidraker)

data("dog_pubs")
rakelist <- rapidrake(txt = dog_pubs$abstract[1:5])

rapidrake() outputs a list of data frames. Each data frame contains the keywords that were extracted for an element of txt:

rakelist

You can bind these data frames together using slowraker::rbind_rakelist():

rakedf <- rbind_rakelist(rakelist, doc_id = dog_pubs$doi[1:5])
head(rakedf, 5)
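Once the keywords are in a single data frame, ordinary data frame operations apply. As a rough sketch, the helper below keeps the highest-scoring keywords for each document using base R only. top_keywords() is a hypothetical function, not part of rapidraker or slowraker, and it assumes the data frame has doc_id and score columns, so check names(rakedf) before relying on it.

```r
# Hypothetical helper (not part of rapidraker or slowraker): keep the n
# highest-scoring keywords for each document.
# Assumes `df` has the columns doc_id and score.
top_keywords <- function(df, n = 3) {
  # Sort by document, then by descending score within each document
  df <- df[order(df$doc_id, -df$score), , drop = FALSE]
  # Split by document and keep the first n rows of each piece
  pieces <- lapply(split(df, df$doc_id), head, n = n)
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}
```

Calling top_keywords(rakedf, n = 3) would then return at most three rows per document, ordered by doc_id.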

Learning more



crew102/rapidraker documentation built on June 7, 2021, 3:05 p.m.