knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
The goal of nacleanR is to provide functions that aid in data cleansing. It comprises functions to:
Probe the percentage of missing values in each variable
Find valid and invalid variables based on their percentage of missing data
Remove variables whose percentage of missing values exceeds a user-defined limit from a dataset
Calculate an age variable in years from an existing calendar-year variable in the dataset by subtracting the year variable from the current system year
Update: The package is currently not available on CRAN.
Please use GitHub to install the development version.
Once a released version of nacleanR is available on CRAN, you can install it with:
install.packages("nacleanR")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("anandjage/nacleanR")
percent_na(dataset)
This is a basic example which shows how to find the percentage of missing values (NA) in each variable:
library(nacleanR)
## basic example code
csv <- system.file("extdata", "nadata.csv", package = "nacleanR")
sample_data <- read_data(path = csv)
percent_na(sample_data)
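For readers who want to see the underlying idea, the percentage of missing values per variable can be sketched in base R. This is a minimal illustration and not the package's implementation; the data frame `toy` and the name `na_pct` are made up for the example.

```r
# Hypothetical toy data frame; not the package's bundled nadata.csv.
toy <- data.frame(
  a = c(1, NA, 3, NA),
  b = c("x", "y", NA, "z"),
  c = c(TRUE, FALSE, TRUE, TRUE)
)

# is.na() on a data frame returns a logical matrix; colMeans() gives the
# share of TRUE (missing) values per column, scaled here to a percentage.
na_pct <- colMeans(is.na(toy)) * 100
print(na_pct)
```

The output format of `percent_na()` itself may differ from this named numeric vector.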
invalidcols(dataset,threshold)
This is a basic example which shows the variables whose percentage of missing values (NA) is above the user-defined threshold:
nacleanR::invalidcols(data = sample_data,threshold = 50)
validcols(dataset,threshold)
This is a basic example which shows the variables whose percentage of missing values (NA) is within the user-defined threshold:
nacleanR::validcols(data = sample_data,threshold = 50)
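The split into valid and invalid variables can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the toy data is hypothetical, the threshold is read as a percentage, and a variable sitting exactly at the threshold is treated as valid, which may not match how `validcols()` and `invalidcols()` handle that boundary.

```r
# Hypothetical toy data: column a is 75% NA, b is 25% NA, c has no NA.
toy <- data.frame(
  a = c(1, NA, NA, NA),
  b = c(1, 2, NA, 4),
  c = 1:4
)

na_pct <- colMeans(is.na(toy)) * 100
threshold <- 50  # percentage of missing values tolerated (assumed meaning)

# Assumption: strictly above the threshold is invalid, everything else is valid.
invalid <- names(na_pct)[na_pct > threshold]
valid <- names(na_pct)[na_pct <= threshold]

print(invalid)
print(valid)
```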
select_cols(dataset,threshold)
This is a basic example which returns the dataset after removing the variables whose percentage of missing values is above the user-defined threshold:
new_data <- nacleanR::select_cols(data = sample_data,threshold = 50)
new_data
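Dropping the variables above the threshold amounts to a logical column subset. The base R sketch below shows that idea only; the toy data and the names `keep` and `cleaned` are hypothetical, and the boundary rule (at or below the threshold is kept) is an assumption.

```r
# Hypothetical toy data: column a is 75% NA, b is 25% NA, c has no NA.
toy <- data.frame(
  a = c(1, NA, NA, NA),
  b = c(1, 2, NA, 4),
  c = 1:4
)
threshold <- 50  # percentage of missing values tolerated (assumed meaning)

# Logical vector marking the columns to keep.
keep <- colMeans(is.na(toy)) * 100 <= threshold

# drop = FALSE keeps the result a data frame even if only one column survives.
cleaned <- toy[, keep, drop = FALSE]
print(names(cleaned))
```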
age_cal(dataset,variable)
Calculates age in years by subtracting a year variable from the current system year. The result is a new vector that can be added to the dataset.
csv <- system.file("extdata", "agedata.csv", package = "nacleanR")
agedata <- read_data(csv)
agedata$ageTodaySinceBuilt <- age_cal(agedata, "YearBuilt")
agedata$ageTodaySinceRenovated <- age_cal(agedata, "YearRenovated")
head(agedata)
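The age calculation described above can be sketched in base R by taking the year from `Sys.Date()`. This is an illustration, not the body of `age_cal()`; the data frame `houses` and the column `age_since_built` are hypothetical.

```r
# Current calendar year taken from the system date.
current_year <- as.integer(format(Sys.Date(), "%Y"))

# Hypothetical data; a missing year stays NA after the subtraction.
houses <- data.frame(YearBuilt = c(1990L, 2005L, NA))

# Age in years = current year minus the year variable.
houses$age_since_built <- current_year - houses$YearBuilt
print(houses)
```

Because the result depends on the system date, the ages change from year to year.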