The goal of package6 is to encapsulate useful helper functions used for data analysis projects similar to New Taipei City Real Estate Value Prediction.
You can install the development version of package6 from GitHub with:
# install.packages("devtools")
devtools::install_github("DSCI-310/DSCI-310-Group6-Package")
library(package6)
format_column_names()
Raw data sets sometimes have column names that contain blank spaces.
df <- data.frame(`col Name1` = c(1, 2), `col Name2` = c("3", "4"), check.names = FALSE)
df
It is generally a good idea to replace the blank spaces, for example with underscores. This is what base::gsub() does.
names(df) <- gsub(" ", "_", names(df))
df
Notice how gsub() requires three arguments and returns a character vector rather than a data frame, so its result has to be assigned back to names(df), which modifies the original data frame. package6::format_column_names() requires only one argument. It does not modify the original data frame and returns a data frame with formatted column names.
df <- data.frame(`col Name1` = c(1, 2), `col Name2` = c("3", "4"), check.names = FALSE)
formatted_df <- format_column_names(df)
formatted_df
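To make the "does not modify the original" behaviour concrete, here is a minimal base R sketch of how such a helper could work. It is not the package's source: the name `format_column_names_sketch` is hypothetical, and replacing spaces with underscores is an assumption carried over from the gsub() example above.

```r
# Hypothetical stand-in for format_column_names(); not the package source.
format_column_names_sketch <- function(data) {
  stopifnot(is.data.frame(data))
  # R copies on modification, so the caller's data frame is left untouched.
  names(data) <- gsub(" ", "_", names(data), fixed = TRUE)
  data
}

raw <- data.frame(`col Name1` = c(1, 2), `col Name2` = c("3", "4"),
                  check.names = FALSE)
clean <- format_column_names_sketch(raw)

names(raw)    # "col Name1" "col Name2"
names(clean)  # "col_Name1" "col_Name2"
```

Because the renaming happens on the function's local copy, `raw` keeps its original names and only the returned data frame carries the formatted ones.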
split_data(dataset, train_perc, vs_prec, test_perc)
Splits a data set/data frame into three parts: training data, test data, and variable selection data.
Let's start with the mtcars data frame:
library(dplyr)  # provides glimpse()
df <- mtcars
glimpse(df)
# Call split_data() once so that all three parts come from the same split.
splits <- split_data(mtcars)
train <- splits$train
cv <- splits$cv
test <- splits$test
glimpse(train)
glimpse(cv)
glimpse(test)
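As a rough illustration of what a three-way split involves, the sketch below shuffles the row indices once and slices them into disjoint parts. It is not the implementation of split_data(): the function name `split_three_ways`, its `train_frac`, `vs_frac`, and `seed` parameters, and the 60/20/20 default proportions are all assumptions. The element names `train`, `cv`, and `test` mirror the ones used in the example above.

```r
# Hypothetical three-way split; names and default proportions are assumptions.
split_three_ways <- function(data, train_frac = 0.6, vs_frac = 0.2, seed = 1) {
  stopifnot(is.data.frame(data), train_frac > 0, vs_frac > 0,
            train_frac + vs_frac < 1)
  set.seed(seed)                 # note: this resets the global RNG state
  n <- nrow(data)
  idx <- sample.int(n)           # shuffle the row indices once
  n_train <- floor(train_frac * n)
  n_vs <- floor(vs_frac * n)
  list(
    train = data[idx[seq_len(n_train)], , drop = FALSE],
    cv    = data[idx[n_train + seq_len(n_vs)], , drop = FALSE],
    test  = data[idx[(n_train + n_vs + 1):n], , drop = FALSE]
  )
}

parts <- split_three_ways(mtcars)
sapply(parts, nrow)  # train 19, cv 6, test 7 for the 32 rows of mtcars
```

Slicing a single shuffled index vector guarantees that no row appears in more than one part, which is why the split should be computed once and then reused.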
cal_rmse(x,y)
Calculates the root mean squared error (RMSE) between predicted and actual values.
predicted <- c(12, 5, 19, 3)
actual <- c(11, 4, 15, 6)
cal_rmse(predicted, actual)
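For reference, RMSE is the square root of the mean of the squared differences between predictions and observations. The sketch below applies that standard definition in base R; `rmse_sketch` is a hypothetical name, and whether cal_rmse() is implemented exactly this way is an assumption.

```r
# Hypothetical stand-in for cal_rmse(); assumes the standard RMSE definition.
rmse_sketch <- function(predicted, actual) {
  stopifnot(is.numeric(predicted), is.numeric(actual),
            length(predicted) == length(actual))
  sqrt(mean((predicted - actual)^2))
}

predicted <- c(12, 5, 19, 3)
actual <- c(11, 4, 15, 6)
rmse_sketch(predicted, actual)  # sqrt(27 / 4), about 2.598
```

The differences here are 1, 1, 4, and -3, so the squared errors sum to 27 and their mean is 6.75.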
| Package      | Version |
| ------------ | ------- |
| tidyverse    | 1.3.1   |