This is a tbl_df mapping common misspellings to their correct spellings, compiled by Wikipedia, where it is licensed under the CC BY-SA license. (Three words with non-ASCII characters were filtered out.) If you'd like to reproduce this dataset from Wikipedia, see the example code below.
An object of class tbl_df (inherits from tbl, data.frame) with 4505 rows and 2 columns.
https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
## Not run:
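library(rvest)
library(readr)
library(dplyr)
library(stringr)
library(tidyr)

# Wikipedia page listing entries in the form "misspelling->correct"
u <- "https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines"
h <- read_html(u)

misspellings <- h %>%
  html_nodes("pre") %>%
  html_text() %>%
  # split each line on ">", which leaves a trailing "-" on the misspelling
  readr::read_delim(col_names = c("misspelling", "correct"), delim = ">",
                    skip = 1) %>%
  # drop the trailing "-" left over from the "->" separator
  mutate(misspelling = str_sub(misspelling, 1, -2)) %>%
  # one row per correction when several are listed, separated by ", "
  mutate(correct = str_split(correct, ", ")) %>%
  unnest(correct) %>%
  # drop corrections whose encoding is marked as UTF-8 (non-ASCII characters)
  filter(Encoding(correct) != "UTF-8")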
## End(Not run)