I explicitly use this package to teach data cleaning, so I have refactored my old cleaning code into several scripts, which I also include as compiled Markdown reports. Caveat: these are realistic cleaning scripts, not the highly polished ones people write with 20/20 hindsight :) I wouldn't necessarily clean the data the same way again (and I would download more recent data!), but at this point there is great value in reproducing the data I've been using for ~5 years.
Cleaning history
- gdata package. It was kind of painful, due to encoding and other issues. See the scripts in this state in v0.1.0.
- readxl. This was much less painful. Present day.

The table of scripts, notebooks, and outputs is built with:

    library(tidyverse)
    library(stringr)
    library(knitr)
    library(here)

    # List everything in data-raw/ and split each file name into
    # script number, slug, and extension.
    x <- tibble(fls = list.files(here("data-raw"))) %>%
      mutate(fls_basename = basename(fls)) %>%
      separate(fls_basename, c("script", "slug", "ext"), "[_\\.]")

    # Keep only numbered scripts and the file types reported below.
    x <- x %>%
      filter(
        script %>% str_detect("^[0-9]+"),
        ext %>% str_detect("R|r|md|tsv")
      ) %>%
      select(-slug)

    # One nested data frame per script number.
    y <- x %>%
      group_by(script) %>%
      nest()

    # Turn file names into comma-separated Markdown links.
    collapse_md_links <- function(x) {
      x %>%
        { paste0("[", ., "](", ., ")") } %>%
        paste(collapse = ", ")
    }

    # Build one table row per script: R script, notebook, and TSV output.
    jfun <- function(z) {
      tibble(
        r_script = z$fls[z$ext == "R"] %>% collapse_md_links(),
        notebook = z$fls[z$ext == "md"] %>% collapse_md_links(),
        tsv = z$fls[z$ext == "tsv"] %>% collapse_md_links()
      )
    }

    y$data %>%
      map_df(jfun) %>%
      kable()
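To make the file-name handling concrete, here is a minimal sketch of the split-and-link step on its own. The file names are hypothetical stand-ins that assume a `<number>_<slug>.<ext>` naming pattern, and `md_link()` is an illustrative helper, not a function from this package.

```r
library(tibble)
library(tidyr)

# Hypothetical file names, assumed to follow "<number>_<slug>.<ext>"
fls <- c("01_pop.R", "01_pop.md", "01_pop.tsv", "02_lifeExp.R")

# Split on "_" or "." into three pieces, keeping the original name
parts <- tibble(fls = fls) %>%
  separate(fls, c("script", "slug", "ext"), sep = "[_\\.]", remove = FALSE)

# Illustrative helper: format file names as comma-separated Markdown links
md_link <- function(x) {
  paste(paste0("[", x, "](", x, ")"), collapse = ", ")
}

r_links <- md_link(parts$fls[parts$ext == "R"])
print(parts)
print(r_links)
```

A file name with extra underscores or dots would yield more than three pieces, so `separate()` would warn and drop the extras; the sketch assumes that does not happen.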
    devtools::session_info()