It is often said that data manipulation alone takes 50-70% of the time in a data science project.
Much of that time can be attributed to our unfamiliarity with the datasets we are given.
The functions in this package help you get familiar with a data.frame quickly, which reduces coding errors and rework.
# The easiest way to get dataframeexplorer is to install the released version:
install.packages("dataframeexplorer", dependencies = TRUE)
# Alternatively, you can install the development version from GitHub:
install.packages("devtools")
devtools::install_github("ashrithssreddy/dataframeexplorer")
Functions:
- [x] Percentiles
- [x] Level of dataset
- [ ] Univariate Analysis
- [ ] Bivariate Analysis
- [ ] Show progress bar for level_of_dataset
- [ ] Run the level_of_dataset code in parallel for performance
Changes:
- ~~Return value for all functions to be included into documentation~~
- ~~Message not printed in all codes~~
- ~~Default filename not consistent~~
- Outputs not refined
- Pep 8 formatting
- Examples not consistent
- Comments not consistent across all codes
- sink() to be run in glimpse_to_file upon an error
- ~~Argument format to be used: dataset = dataset, output_filename = "dataset_glimpse.txt"~~
- Throw a warning when duplicate column names are found
- ~~Level: Unsink when interrupted~~
- ~~Add instructions to interpretation of output~~
1. glimpse_to_file
glimpse_to_file(mtcars, "mtcars_glimpse.txt") or, with a full path, glimpse_to_file(mtcars, "C:/Users/Desktop/mtcars_glimpse.txt")
![Output](/man/figures/.png)
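
To make the idea behind glimpse_to_file concrete, here is a minimal base-R sketch of writing a data.frame summary to a text file. It is not the package's implementation: `glimpse_sketch` is a hypothetical helper, and its use of `str()`, the duplicate-column warning, and the `on.exit()` cleanup are assumptions that loosely follow the items in the Changes list above.

```r
# Hypothetical helper, NOT part of dataframeexplorer: writes a structural
# summary of a data.frame to a text file using only base R.
glimpse_sketch <- function(dataset, output_filename = "dataset_glimpse.txt") {
  stopifnot(is.data.frame(dataset))

  # Warn when column names repeat, since duplicates make a summary ambiguous.
  if (anyDuplicated(names(dataset)) > 0) {
    warning("Duplicate column names found in dataset")
  }

  # Redirect console output to the file. on.exit() restores the console
  # when the function exits, including after an error or an interrupt.
  sink(output_filename)
  on.exit(sink(), add = TRUE)

  str(dataset)
  invisible(output_filename)
}

# Write to a temporary directory so the example leaves no files behind.
out_file <- file.path(tempdir(), "mtcars_glimpse.txt")
glimpse_sketch(mtcars, out_file)
glimpse_lines <- readLines(out_file)
```

Registering `sink()` with `on.exit()` immediately after opening the sink means console output is restored however the function exits, which is the behaviour the "sink() to be run in glimpse_to_file upon an error" item asks for.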
For suggestions, email ashrithssreddy@gmail.com with "dataframeexplorer" in the subject line.