The goal of ‘meltr’ is to provide a fast and friendly way to read non-rectangular data (like ragged forms of ‘csv’, ‘tsv’, and ‘fwf’).
Standard tools like readr::read_csv() can cope to some extent with
unusual inputs, like files with empty rows
or newlines embedded in strings. But some files are so wacky that
standard tools don’t work at all, and instead you have to take the file
to pieces and reassemble to get structured data you can work with.
The meltr package provides tools to do this.
You can install the released version of meltr from CRAN with:
install.packages("meltr")
Or you can install the development version with:
# install.packages("devtools")
devtools::install_github("r-lib/meltr")
Here’s a contrived example that breaks two assumptions made by common
tools like readr::read_csv(): some rows have more cells than others, and
data types are mixed within each column.
In contrast, the melt_csv() function reads the file one cell at a
time, importing each cell of the file into a whole row of the final data
frame.
writeLines("Help,,007,I'm
1960-09-30,FALSE,trapped in,7,1.21
non-rectangular,data,NA", "messy.csv")
library(meltr)
melt_csv("messy.csv")
#> # A tibble: 12 × 4
#>      row   col data_type value
#>    <dbl> <dbl> <chr>     <chr>
#>  1     1     1 character Help
#>  2     1     2 missing   <NA>
#>  3     1     3 character 007
#>  4     1     4 character I'm
#>  5     2     1 date      1960-09-30
#>  6     2     2 logical   FALSE
#>  7     2     3 character trapped in
#>  8     2     4 integer   7
#>  9     2     5 double    1.21
#> 10     3     1 character non-rectangular
#> 11     3     2 character data
#> 12     3     3 missing   <NA>
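Because each cell carries its own row and col, the ragged shape and the mixed column types visible in the output above can also be checked in code. The following is a minimal sketch in base R, assuming the messy.csv example from above; the names cells, cells_per_row and types_per_col are illustrative and not part of meltr.

```r
library(meltr)

# Recreate the example file so this block runs on its own
writeLines("Help,,007,I'm
1960-09-30,FALSE,trapped in,7,1.21
non-rectangular,data,NA", "messy.csv")

cells <- melt_csv("messy.csv")

# Count the cells found in each row of the file;
# unequal counts mean the file is not rectangular
cells_per_row <- lengths(split(cells$value, cells$row))
cells_per_row

# Count the distinct guessed types in each column;
# a count above one means the column mixes data types
types_per_col <- tapply(cells$data_type, cells$col,
                        function(x) length(unique(x)))
types_per_col
```

Checks like these can help decide how a file needs to be taken apart before any reassembly is attempted.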
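The output of melt_csv() gives us one row for each cell of the input, with the cell’s position recorded in row and col. No data type conversion is attempted: every value is imported as a string, and the data_type column merely gives meltr’s best guess of what the data types ought to be.
What are some ways you can use this? To begin with, you can do some simple manipulations with ordinary functions.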
For example, you could extract the words.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data <- melt_csv("messy.csv")
data %>%
filter(data_type == "character")
#> # A tibble: 6 × 4
#>     row   col data_type value
#>   <dbl> <dbl> <chr>     <chr>
#> 1     1     1 character Help
#> 2     1     3 character 007
#> 3     1     4 character I'm
#> 4     2     3 character trapped in
#> 5     3     1 character non-rectangular
#> 6     3     2 character data
Or check whether there are any missing entries.
data %>%
filter(data_type == "missing")
#> # A tibble: 2 × 4
#>     row   col data_type value
#>   <dbl> <dbl> <chr>     <chr>
#> 1     1     2 missing   <NA>
#> 2     3     3 missing   <NA>
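Since the value column holds strings, cells usually need converting and regrouping before they are useful. As a rough sketch under the same assumptions (the messy.csv example and dplyr), the cells guessed to be numeric can be converted with base as.numeric(), and the non-missing values regrouped by their original row. The names numbers, present and by_row are illustrative only.

```r
library(meltr)
library(dplyr)

# Recreate the example file so this block runs on its own
writeLines("Help,,007,I'm
1960-09-30,FALSE,trapped in,7,1.21
non-rectangular,data,NA", "messy.csv")

data <- melt_csv("messy.csv")

# Keep only the cells guessed to be numeric and convert the strings
numbers <- data %>%
  filter(data_type %in% c("integer", "double")) %>%
  mutate(value = as.numeric(value))
numbers

# Drop missing cells, then regroup the values by source row
# into a ragged list with one character vector per row
present <- data %>%
  filter(data_type != "missing")
by_row <- split(present$value, present$row)
by_row
```

Keeping the reassembled rows in a list avoids forcing them back into a rectangle before their structure is understood.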