ptwikiwords

Words used in Portuguese Wikipedia

This data package contains a dataset of the words found in a random sample of ~15,000 pages from the Portuguese Wikipedia.

Installing

Install it from GitHub using devtools:

devtools::install_github("dfalbel/ptwikiwords")

Using

After installing the package, you can load the dataset using:

library(ptwikiwords)
data(ptwikiwords)
head(ptwikiwords)

The dataset contains 3 columns; the code below uses word, count, and check.

Here is a wordcloud of those words:

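suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(wordcloud))
# Keep rows whose check flag is TRUE, then take the first 300 rows.
# TRUE is used instead of T because T can be reassigned.
words_filter <- ptwikiwords %>%
  filter(check == TRUE) %>%
  slice(1:300)
# Draw the cloud, sizing each word by its count
wordcloud(words_filter$word, words_filter$count)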
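The filter-and-slice step above can be tried without installing the package. The following is a minimal base R sketch, assuming a toy data frame that only borrows the column names word, count, and check; the values are invented for illustration. Since slice(1:300) takes the first 300 rows as stored, and it is not stated here whether the packaged data is already sorted by count, the sketch sorts explicitly before taking the top rows.

```r
# Hypothetical toy data with the same column names as ptwikiwords;
# the values are invented for illustration only.
toy <- data.frame(
  word  = c("de", "xyzq", "a", "que", "o"),
  count = c(50L, 7L, 40L, 20L, 30L),
  check = c(TRUE, FALSE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Keep rows whose check flag is TRUE.
kept <- toy[toy$check, ]

# Sort by count, largest first, so "top n" does not depend on row order.
kept <- kept[order(kept$count, decreasing = TRUE), ]

# Take the n most frequent words (n = 3 here; the example above uses 300).
top_words <- head(kept, 3)
print(top_words$word)
```

With the real dataset, the same explicit ordering could be done in dplyr with arrange(desc(count)) before slice().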

Here is a wordcloud of the 2-grams (pairs of adjacent words):

data(ngrams)
words_filter <- ngrams %>%
  slice(1:100)
wordcloud(words_filter$ngrams, words_filter$count)
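For readers unfamiliar with the term, a 2-gram is a pair of adjacent words. The following base R sketch counts 2-grams in a made-up sentence. It only illustrates the idea and is not necessarily how the ngrams dataset in this package was built.

```r
# Made-up example sentence, for illustration only.
text <- "a casa de pedra e a casa de madeira"

# Split the sentence into words on single spaces.
tokens <- strsplit(text, " ", fixed = TRUE)[[1]]

# Pair each word with the word that follows it.
bigrams <- paste(head(tokens, -1), tail(tokens, -1))

# Count how often each pair occurs, most frequent first.
bigram_counts <- sort(table(bigrams), decreasing = TRUE)
print(bigram_counts)
```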


dfalbel/ptwikiwords documentation built on May 15, 2019, 5:10 a.m.