This package provides tools for analyzing the Amazon Customer Reviews Dataset.



The development version of amazonreviews can be installed from GitHub (repository jimtyhurst/amazonreviews).
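The install command itself is not shown here. A minimal sketch, assuming the devtools package is used for GitHub installs (remotes::install_github() takes the same repository argument):

```r
# Assumption: devtools is the installer of choice; it is installed first
# if it is not already available.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install the development version straight from the GitHub repository.
devtools::install_github("jimtyhurst/amazonreviews")
```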


There are no plans to publish a released version to any package repository.

Accessing data

Before analyzing the review data, you must download datasets from Amazon's public repository of review datasets, which Amazon hosts on its S3 service. The files are large, so downloading each compressed dataset takes a while, as does loading a full dataset into your R environment.

You can either download the datasets manually using the AWS CLI from a command line, or use the tools provided in this package to download them from within your code.
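To illustrate the download-from-code route without relying on this package's own functions (which are not listed here), the following is a rough sketch in base R. The helper name download_review_file is hypothetical, and the default base_url is an assumption about where the TSV files are hosted; check it against Amazon's current documentation before use.

```r
# Hypothetical helper, not part of amazonreviews: download one dataset file
# into dest_dir unless it is already there, and return the local path.
# Assumption: base_url points at the public location of the TSV files.
download_review_file <- function(file_name,
                                 dest_dir,
                                 base_url = "https://s3.amazonaws.com/amazon-reviews-pds/tsv") {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  dest_path <- file.path(dest_dir, file_name)
  if (!file.exists(dest_path)) {
    # mode = "wb" keeps the compressed file intact on Windows.
    utils::download.file(paste(base_url, file_name, sep = "/"),
                         destfile = dest_path, mode = "wb")
  }
  dest_path
}

# Example usage (left commented out because the files are large):
# path <- download_review_file("amazon_reviews_us_Digital_Software_v1_00.tsv.gz",
#                              "~/data/amazon_reviews")
# reviews <- read.delim(path, quote = "", stringsAsFactors = FALSE)
```

Skipping the download when the file already exists avoids repeating a slow transfer on every run.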

We recommend setting an environment variable, REVIEW_DIRECTORY, that identifies the directory where the datasets are (or will be) stored in your local environment.
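As one way to set and read REVIEW_DIRECTORY from within R, here is a minimal sketch. The directory path is a placeholder, and how the package itself reads the variable is not shown here.

```r
# Set the variable for the current R session only.
# The path below is a placeholder; substitute your own directory.
Sys.setenv(REVIEW_DIRECTORY = file.path(path.expand("~"), "data", "amazon_reviews"))

# Read it back, falling back to a temporary directory when it is unset.
review_dir <- Sys.getenv("REVIEW_DIRECTORY", unset = tempdir())
print(review_dir)
```

To make the setting persist across sessions, a line of the form REVIEW_DIRECTORY=/path/to/reviews can be added to your .Renviron file instead.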



See the vignettes for several analyses of the reviews.


Copyright © 2020 Jim Tyhurst. Licensed under the Open Software License version 3.0.

jimtyhurst/amazonreviews documentation built on July 10, 2020, 12:31 a.m.