knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

rmiscfun

Travis build status AppVeyor build status Codecov test coverage

The goal of rmiscfun is provide call functions that I use in different projects (at work and personal ones.)

Prerequisites

  1. Install Rtools (Only if you are on Windows): Download and execute the recommended EXE file in this link.
  2. Install devtools package in R:
install.packages("devtools")
  1. Install tidyverse package in R. I use functions from several tidyverse packages, so it is better to have them all installed.
install.packages("tidyverse")

Installation

You can install the current version of rmiscfun from GitHub with:

devtools::install_github("gbasulto/rmiscfun")

Available Functions

| Function | Brief description | | :-------------------- | :------------------------------------------------------------| | glance_data | Summarize both, categorical and numerical variables in a dataframe | | glance_data_in_workbook | Similar to glance_data, but it breaks the summary into types and allows the used to save it in an Excel Workbook | | plot_numerical_vars | Graphical summaries of numerical variables using functions from ggplot2 and GGally| | clean_colnames | Clean column names | | clean_col_content | Clean column content if a variable is character or factor | | interpolate_values | Interpolate values of a variable | | add_missing_columns | Append all the columns not present in a reference vector | | var_imp_plot | Variable importance plot for random forest |

Examples

I am using the Iris dataset in R, which has 5 variables. The first four are measurements 150 flowers and the last column specifies the species (there are 50 flowers of each species).

## Uncomment the following line to read the documentation of the dataset.
## help(iris)

head(iris)

glance_data

## Load package
library(rmiscfun)

## Check documentation
help("glance_data")

## Summarize iris dataset
glance_data(iris)

glance_data_in_workbook

## Load package
library(rmiscfun)

## Check documentation
help("glance_data_in_workbook")

## Summarize iris dataset
glance_data_in_workbook(iris)

## Uncomment the following line to summarize iris dataset AND create Excel Worksheet
## glance_data_in_workbook(iris, "iris_in_excel.xlsx")

plot_numerical_vars

## Load package
library(rmiscfun)

## Check documentation
help("plot_numerical_vars")

plot_numerical_vars(iris, "pairwise")
plot_numerical_vars(iris, "density")
plot_numerical_vars(iris, "boxplot")
plot_numerical_vars(iris, "violin")
plot_numerical_vars(iris, "histogram")
plot_numerical_vars(iris, "qqplot")

clean_colnames

## Load package
library(rmiscfun)

## Check documentation
help("clean_colnames")

input <- c("bart Simpson", "LisaSimpson", "maggie..simpson!",
                 "MARGE-Simpson", "Homer Simpson :-)")
clean_colnames(input)

clean_col_content

library(rmiscfun)

clean_col_content(c("bart Simpson", "LisaSimpson",
                    "maggie..simpson!",
                    "MARGE-Simpson", "Homer Simpson :-)"))

## Get warning for factors.
clean_col_content(
  factor(c("bart  Simpson", "LisaSimpson",
           "maggie..simpson!", "MARGE-Simpson",
           "bart  Simpson", "Homer Simpson :-)"))
)

interpolate_values

library(rmiscfun)

 x <- c(1, 2, 4, 5)
 y <- c(1, 3, 7)
 z <- c("a", "b", "a")
 interpolate_values(x, y, z)

add_missing_columns

library(rmiscfun)

input_df <- data.frame(a = 1:3, b = letters[1:3])

## Reference vector
colnames_vector <- c("b", "c")

## Filler
filler <- -888

## Output vector
add_missing_columns(input_df, colnames_vector, filler)

var_imp_plot

library(randomForest)
library(rmiscfun)

## Fit random forest
rf <- randomForest(Species ~ ., data = iris)

## Display variable importance plot
var_imp_plot(rf)


gbasulto/rmiscfun documentation built on July 25, 2019, 8:56 p.m.