README.md

RClean

Nicholas Uhorchak 2018-03-05


Section 1 Basic Information

1.1 Name

RClean

1.2 Title

RCleaner is an interactive data cleaning tool that lets users import and clean data dynamically. At its core, it gives R users functionality similar to that of Microsoft Excel for preparing a dataset for analysis.

1.3 Description

1.3A Features

Using R to import and clean data is often a time-consuming task. Without first preparing the dataset in Excel or other software, R users must write scripts or command-line R code to do this. The Interactive Data Cleaning tool will give users the ability to do the following:

1.3B End users

This analytic is being developed for users who need to clean data quickly or who would rather not spend a large amount of time writing code to prepare data for analysis. Typical users will have a working knowledge of R but prefer the point-and-click abilities of Microsoft Excel or similar software.

1.3C Required knowledge/skills

Users must be able to navigate RStudio and understand how to use an R gadget. In addition, they should know which types of data the dataset contains, numerical or categorical, so that they understand which functions of this analytic tool apply to which columns.

1.3D Statistical methods utilized

1.3E R Packages utilized

This analytic will utilize the following existing R packages:

1.4 End user access

End users will call this gadget from the associated R package.

1.5 Security concerns

None

1.6 Design constraints

Currently, the gadget only handles data frames, matrices, or tibble-like objects with two or more columns. Single vectors are not handled.
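The constraint above can be checked before the gadget is called. The following is a minimal sketch in base R; `is_supported_input` is a hypothetical helper written for illustration and is not part of RClean.

```r
# Hypothetical helper, not part of RClean: tests whether an object meets the
# documented input constraint (tabular object with at least two columns).
is_supported_input <- function(x) {
  # Tibbles inherit from data.frame, so is.data.frame() covers them as well.
  # Bare vectors are neither data frames nor matrices and are rejected here.
  is_tabular <- is.data.frame(x) || is.matrix(x)
  # && short-circuits, so ncol() is only evaluated for tabular objects.
  is_tabular && ncol(x) >= 2
}

is_supported_input(mtcars)                  # data frame with several columns
is_supported_input(as.matrix(iris[, 1:4]))  # numeric matrix
is_supported_input(c(1, 2, 3))              # single vector
is_supported_input(data.frame(a = 1:3))     # only one column
```

One way to work with a single vector under this constraint would be to bind it into a data frame alongside at least one other column first.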

Section 2 Delivery and Schedule Information

2.1 Feature Review

| Feature | Description | Rank | Status | Value to user | Inputs | Outputs | Use? | Time? | Current or future version |
|---|---|---|---|---|---|---|---|---|---|
| Visual inspection of data | This feature will open the newly imported DF so the user can look at the data | 1 | COMPLETE | Quick and easy visual exploration of the dataset imported | Some dataset | Dataset output onto screen | Visual exploration of data | Yes | Current |
| Select relevant data columns to retain/remove | Allow the user to select what columns to either retain or remove from the current data | 2 | COMPLETE | Easily remove unwanted variables from the dataset | button click | Modified DF | Data cleaning | Yes | Current |
| Select relevant data rows to retain/remove | Allow the user to select what rows to either retain or remove from the current data | 3 | COMPLETE | Easily remove unwanted rows from the dataset | button click | Modified DF | Data cleaning | Yes | Current |
| Save clean data | User can save the "clean" data to a new dataframe in R | 4 | COMPLETE | Cleaned data saved for analysis | new name for clean DF | Clean DF | Save cleaned DF for future use | Yes | Current |
| Scale data | Allow the user to scale the data | 5 | COMPLETE | Scale the data for future use | button click | Modified DF | Data prep | No | Current |
| Mean center data | Allow the user to center the data | 6 | COMPLETE | Mean center the data for future use | button click | Modified DF | Data prep | No | Current |
| Rename columns | Allow the user to rename columns in the DF | 7 | COMPLETE | Rename columns if necessary | Column names if necessary | Modified DF | Data cleaning | No | Future |
| Create indicator variables | Allow the user to create "dummy" variables to represent nominal data | 8 | COMPLETE | Create indicator variables | Variables to encode | Modified DF | Data prep | No | Future |
| Write "clean" data to Excel | Allow user to write the clean data to new Excel file | 9 | COMPLETE | Clean data is saved into external file for future use | file location | Excel document | Save file as Excel doc for future use | No | Future |
| Modify DF cells | Allow users to click on a cell and change data values | 10 | not started | Single cell value modification | N/A | Modified DF | Change cells | No | Future |
| Impute missing values | Allow the user to impute missing values | 11 | not started | NA | Method of imputation | Modified DF | Data prep | No | Future |
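For readers who want to see what the scaling, mean-centering, and indicator-variable features correspond to in script form, here is a minimal sketch using base R only. It assumes the conventional definitions (centering subtracts each column's mean, scaling additionally divides by the standard deviation) and does not show how RClean implements these features internally. The example data frame and its column names are made up for illustration.

```r
# Illustrative data only; the column names are hypothetical.
df <- data.frame(
  height = c(150, 160, 170),
  weight = c(50, 60, 70),
  group  = factor(c("a", "b", "a"))
)

# Only numeric columns can be centered or scaled.
num_cols <- sapply(df, is.numeric)

# Mean centering: subtract each column's mean, leaving the spread unchanged.
centered <- df
centered[num_cols] <- as.data.frame(
  scale(df[num_cols], center = TRUE, scale = FALSE)
)

# Scaling: center, then divide by each column's standard deviation.
scaled <- df
scaled[num_cols] <- as.data.frame(
  scale(df[num_cols], center = TRUE, scale = TRUE)
)

# Indicator ("dummy") variables: one 0/1 column per level of the factor.
# The "- 1" drops the intercept so that every level gets its own column.
indicators <- model.matrix(~ group - 1, data = df)
```

Dropping the intercept in `model.matrix` keeps one column per level, which is convenient for inspection; for regression modeling, one level is usually omitted to avoid collinearity.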

nuhorchak/RClean documentation built on May 31, 2019, 2:50 p.m.