knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)

Guesses the formatting of tabular/flat files by testing different options for formatting the delimiter, decimal mark, grouping mark and column header. Chooses the combinations that creates a data frame that make most sense, based on the assumptions:

For delimiters following are possible: tab (\t), comma (,), semicolon (;) and whitespace/fixed width. As decimal mark: comma (,) and dot (.). As big number grouping mark: comma (,), dot (.) and space ( ) are tested. Column headers existence is also tested for, altogether 22 possible formatting combinations are tested.

This Shiny app https://rokka.shinyapps.io/smartep demonstrates how this package (together with https://github.com/ijlyttle/shinypod) can be used to create an tabular/flat file reader that you can throw almost any file at - and it will in most cases guess the right formatting for you.

Quick start

Install

Dev version from GitHub.

devtools::install_github("lukas-rokka/readrGuess")
library(readr)
library(readrGuess)

Example

# Print all formatting combinations that are currently tested
delim_cases()

# Create a date frame and format it into a semicolon delimited string
test_str <- readr::format_delim(
  data.frame(a=runif(1000, -100, 100), b="a", c=1, d=TRUE), 
  ";")

# Read the string and guess the right formatting
read_guess(test_str)


lukas-rokka/readrGuess documentation built on May 21, 2019, 8:57 a.m.