
package6



The goal of package6 is to provide helper functions for data analysis projects similar to the New Taipei City Real Estate Value Prediction project.


Installation

You can install the development version of package6 from GitHub with:

# install.packages("devtools")
devtools::install_github("DSCI-310/DSCI-310-Group6-Package")


Usage

library(package6)

format_column_names()

Raw data sets sometimes have column names that contain blank spaces.

df <- data.frame(`col Name1`= c(1,2),`col Name2` = c("3", "4"),  check.names = FALSE)
df

It is generally a good idea to replace the blank spaces, for example with underscores. base::gsub() can do this.

names(df) <- gsub(" ", "_", names(df))
df

Notice that gsub() requires three arguments and returns a character vector, not a data frame, so its result has to be assigned back to names(df), which modifies the original data frame. package6::format_column_names() requires only one argument. It does not modify the original data frame and returns a data frame with formatted column names.

df <- data.frame(`col Name1`= c(1,2),`col Name2` = c("3", "4"),  check.names = FALSE)
formatted_df <- format_column_names(df)
formatted_df
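For readers curious how such a helper could work, here is a minimal base R sketch. The name format_column_names_sketch is hypothetical, and the sketch assumes blank spaces are replaced with underscores, as in the gsub() example above; the actual package6 implementation may differ.

```r
# Hypothetical stand-in for illustration; not the package6 source.
format_column_names_sketch <- function(data) {
  stopifnot(is.data.frame(data))
  # R copies on modification, so the caller's data frame is left untouched.
  names(data) <- gsub(" ", "_", names(data), fixed = TRUE)
  data
}

raw <- data.frame(`col Name1` = c(1, 2), `col Name2` = c("3", "4"), check.names = FALSE)
clean <- format_column_names_sketch(raw)
names(clean)  # "col_Name1" "col_Name2"
names(raw)    # still "col Name1" "col Name2"
```

Because the function works on its own copy of the argument, the input and the result can be kept side by side for comparison.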

split_data(dataset, train_perc, vs_prec, test_perc)

Splits a data set (data frame) into three parts: training data, test data, and variable selection data.

Let's start with the mtcars data frame. glimpse() comes from dplyr, so load that package first:

library(dplyr)  # provides glimpse()
df <- mtcars
glimpse(df)

# Call split_data() once so that all three parts come from the same split
splits <- split_data(mtcars)
train <- splits$train
cv <- splits$cv
test <- splits$test
glimpse(train)
glimpse(cv)
glimpse(test)
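To illustrate the idea of a three-way split, here is a minimal base R sketch. The function name split_three_way, its argument names, and the 60/20/20 proportions are assumptions made for this example; they are not the package6 defaults, and the real split_data() may work differently.

```r
# Hypothetical three-way split for illustration; not the package6 source.
# The proportions below are assumed values, not the package defaults.
split_three_way <- function(data, train_perc = 0.6, vs_perc = 0.2) {
  stopifnot(is.data.frame(data), train_perc + vs_perc < 1)
  n <- nrow(data)
  shuffled <- sample(n)  # random permutation of the row indices
  n_train <- floor(train_perc * n)
  n_vs <- floor(vs_perc * n)
  position <- seq_len(n)
  list(
    train = data[shuffled[position <= n_train], , drop = FALSE],
    vs    = data[shuffled[position > n_train & position <= n_train + n_vs], , drop = FALSE],
    test  = data[shuffled[position > n_train + n_vs], , drop = FALSE]
  )
}

set.seed(310)  # make the random split reproducible
parts <- split_three_way(mtcars)
sapply(parts, nrow)  # train 19, vs 6, test 7 for the 32-row mtcars
```

Shuffling the row indices once and slicing the shuffled vector guarantees that the three parts do not overlap and together cover every row.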

cal_rmse(x,y)

Calculates the root mean squared error (RMSE) between two numeric vectors.

predicted <- c(12, 5, 19, 3)
actual <- c(11, 4, 15, 6)

cal_rmse(predicted, actual)
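As a cross-check, the same quantity can be worked out by hand in base R. This sketch assumes cal_rmse() follows the standard RMSE definition, the square root of the mean squared difference; it is not taken from the package source.

```r
predicted <- c(12, 5, 19, 3)
actual <- c(11, 4, 15, 6)

errors <- predicted - actual   # 1, 1, 4, -3
rmse <- sqrt(mean(errors^2))   # sqrt((1 + 1 + 16 + 9) / 4) = sqrt(6.75)
rmse                           # approximately 2.598
```

RMSE is symmetric in its two arguments, so swapping predicted and actual gives the same value under this definition.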


Dependencies

| Package   | Version |
| --------- | ------- |
| tidyverse | 1.3.1   |


License

License: MIT



DSCI-310/DSCI-310-Group-6-Package documentation built on April 21, 2022, 3:55 a.m.