knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Welcome to episcout!

episcout has many functions that can be used to quickly explore data sets. It is particularly useful during cleaning, describing and visualising data sets of tens of thousands of rows with tens to hundreds of columns.

It was developed using a combination of tidyverse packages, base R functions and data.table.

It suggests many packages rather than importing them, so you are not forced to install packages for functions you do not use.

If you do want to install all of them, run:

# Remove 'eval = FALSE' to run
install.packages(c('dplyr',
                   'tibble',
                   'tidyr',
                   'data.table',
                   'compare',
                   'stringi',
                   'stringr',
                   'lubridate',
                   'purrr',
                   'e1071',
                   'Hmisc',
                   'ggplot2',
                   'cowplot',
                   'scales',
                   'ggthemes',
                   'future',
                   'doFuture',
                   'foreach',
                   'iterators',
                   'magrittr',
                   'reshape2'))
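Since the packages are only suggested, it may be enough to install the ones that are missing. The following is a minimal sketch in base R. `missing_pkgs()` is a hypothetical helper (it is not part of episcout), and the short package vector is only an illustration taken from the list above.

```r
# Hypothetical helper (not part of episcout): return the packages in `pkgs`
# whose namespace cannot be loaded, typically because they are not installed.
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Illustration with a subset of the suggested packages:
suggested <- c("dplyr", "data.table", "ggplot2")
to_install <- missing_pkgs(suggested)
to_install

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Note that `requireNamespace()` loads a package's namespace as a side effect when the package is found, which is usually harmless in an interactive session.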

Many parameters have default values. These reflect a fair amount of personal preference, but the aim is to make it a bit faster to process hundreds of thousands of observations and hundreds of variables across multiple data sets, so convenience and standardisation are often preferred. If the defaults do not suit your case, it is simple enough to defer to your preferred R packages.

Below are a number of examples of how to use episcout functions.

All functions start with "epi_".

Currently there are functions for:

- pre-processing: epi_clean_*
- descriptive statistics: epi_stats_*
- visualising: epi_plot_*
- various: epi_read(), epi_write(), etc.
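Because the function families share a prefix, the functions available in each family can be listed by name. This sketch uses only base R and assumes episcout is installed. `list_by_prefix()` is a hypothetical helper, not an episcout function.

```r
# Hypothetical helper (not part of episcout): keep the names that start
# with a given prefix, sorted alphabetically.
list_by_prefix <- function(fun_names, prefix) {
  sort(fun_names[startsWith(fun_names, prefix)])
}

# Only runs if episcout is installed; prints the exported functions by family.
if (requireNamespace("episcout", quietly = TRUE)) {
  exported <- getNamespaceExports("episcout")
  print(list_by_prefix(exported, "epi_clean_"))
  print(list_by_prefix(exported, "epi_stats_"))
  print(list_by_prefix(exported, "epi_plot_"))
}
```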

A few examples with dummy data

# Test set df:
n <- 20
df <- data.frame(
  var_id = rep(1:(n / 2), each = 2),
  var_to_rep = rep(c('Pre', 'Post'), n / 2),
  x = rnorm(n),
  y = rbinom(n, 1, 0.50),
  z = rpois(n, 2)
)
df
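The dummy data are drawn at random, so the values change on every run. The sketch below builds the same data frame with a fixed seed and then checks its structure. The seed value is an arbitrary assumption, and the checks use base R only; they are not episcout functions.

```r
# Fix the random seed so the simulated values are the same on every run
# (the seed value itself is arbitrary).
set.seed(12345)

n <- 20
df <- data.frame(
  var_id = rep(1:(n / 2), each = 2),
  var_to_rep = rep(c("Pre", "Post"), n / 2),
  x = rnorm(n),
  y = rbinom(n, 1, 0.50),
  z = rpois(n, 2)
)

# Base R checks (not episcout functions):
dim(df)               # number of rows and columns
str(df)               # column types
table(df$var_id)      # each ID appears twice, once for 'Pre' and once for 'Post'
table(df$var_to_rep)  # counts of 'Pre' and 'Post'
```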

Just starting this vignette...

Cleaning

Describing

Visualising



AntonioJBT/episcout documentation built on Nov. 7, 2019, 5:34 p.m.