The goal of rmiscfun is to provide functions that I use across different projects, both at work and personal.
First, install the devtools package in R:
install.packages("devtools")
You also need the tidyverse package in R. I use functions from several tidyverse packages, so it is better to have them all installed:
install.packages("tidyverse")
You can install the current version of rmiscfun from GitHub with:
devtools::install_github("gbasulto/rmiscfun")
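If some of these packages may already be installed, a small check avoids reinstalling them. The following is a minimal sketch using base R's `requireNamespace()`; the names `needed_pkgs`, `is_installed`, and `missing_pkgs` are only illustrative and are not part of rmiscfun.

```r
## Packages mentioned in the installation steps
needed_pkgs <- c("devtools", "tidyverse")

## TRUE for each package whose namespace can be loaded
## (FALSE usually means the package is not installed)
is_installed <- vapply(needed_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)

## Packages that still need to be installed
missing_pkgs <- needed_pkgs[!is_installed]
missing_pkgs

## Uncomment the following line to install whatever is missing.
## if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

The install line is left commented out so that running the block only reports what is missing.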
| Function | Brief description |
| :------------------------ | :----------------------------------------------------------------------------------------------------------------- |
| `glance_data` | Summarize both categorical and numerical variables in a data frame |
| `glance_data_in_workbook` | Similar to `glance_data`, but it breaks the summary into types and allows the user to save it in an Excel workbook |
| `plot_numerical_vars` | Graphical summaries of numerical variables using functions from `ggplot2` and `GGally` |
| `clean_colnames` | Clean column names |
| `clean_col_content` | Clean column content if a variable is character or factor |
| `interpolate_values` | Interpolate values of a variable |
| `add_missing_columns` | Append all the columns not present in a reference vector |
| `var_imp_plot` | Variable importance plot for random forest |
The examples below use the iris dataset in R, which has 5 variables. The first four are measurements of 150 flowers, and the last column specifies the species (there are 50 flowers of each species).
## Uncomment the following line to read the documentation of the dataset.
## help(iris)
head(iris)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
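The size of the dataset and the number of flowers per species can be checked directly with base R. This is only a quick sanity check and does not use rmiscfun.

```r
## Number of rows (flowers) and columns (variables)
dim(iris)
#> [1] 150   5

## Number of flowers per species
table(iris$Species)

## Class of each column: four numeric columns and one factor
sapply(iris, class)
```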
glance_data
## Load package
library(rmiscfun)
## Check documentation
help("glance_data")
## Summarize iris dataset
glance_data(iris)
glance_data_in_workbook
## Load package
library(rmiscfun)
## Check documentation
help("glance_data_in_workbook")
## Summarize iris dataset
glance_data_in_workbook(iris)
## Uncomment the following line to summarize the iris dataset AND save it in an Excel workbook
## glance_data_in_workbook(iris, "iris_in_excel.xlsx")
plot_numerical_vars
## Load package
library(rmiscfun)
## Check documentation
help("plot_numerical_vars")
plot_numerical_vars(iris, "pairwise")
plot_numerical_vars(iris, "density")
plot_numerical_vars(iris, "boxplot")
plot_numerical_vars(iris, "violin")
plot_numerical_vars(iris, "histogram")
plot_numerical_vars(iris, "qqplot")
clean_colnames
## Load package
library(rmiscfun)
## Check documentation
help("clean_colnames")
input <- c("bart Simpson", "LisaSimpson", "maggie..simpson!",
           "MARGE-Simpson", "Homer Simpson :-)")
clean_colnames(input)
clean_col_content
library(rmiscfun)
clean_col_content(c("bart Simpson", "LisaSimpson",
                    "maggie..simpson!",
                    "MARGE-Simpson", "Homer Simpson :-)"))
## Get warning for factors.
clean_col_content(
  factor(c("bart Simpson", "LisaSimpson",
           "maggie..simpson!", "MARGE-Simpson",
           "bart Simpson", "Homer Simpson :-)"))
)
interpolate_values
library(rmiscfun)
x <- c(1, 2, 4, 5)
y <- c(1, 3, 7)
z <- c("a", "b", "a")
interpolate_values(x, y, z)
add_missing_columns
library(rmiscfun)
input_df <- data.frame(a = 1:3, b = letters[1:3])
## Reference vector
colnames_vector <- c("b", "c")
## Filler
filler <- -888
## Output vector
add_missing_columns(input_df, colnames_vector, filler)
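To make the idea concrete, here is a rough base R sketch of one way to read "append the columns that are missing": every name in the reference vector that is absent from the data frame is added as a new column holding the filler value. This is an assumption about the behavior, not the package's actual implementation, and `add_missing_columns` may differ in column order or other details; `missing_cols` and `sketch_df` are illustrative names.

```r
input_df <- data.frame(a = 1:3, b = letters[1:3])
colnames_vector <- c("b", "c")
filler <- -888

## Names in the reference vector that input_df does not have yet
missing_cols <- setdiff(colnames_vector, names(input_df))

## Append each missing column, filled with the filler value
sketch_df <- input_df
for (col in missing_cols) {
  sketch_df[[col]] <- filler
}
sketch_df
```

Here only `c` is missing, so the sketch keeps `a` and `b` untouched and appends a column `c` filled with -888.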
var_imp_plot
library(randomForest)
library(rmiscfun)
## Fit random forest
rf <- randomForest(Species ~ ., data = iris)
## Display variable importance plot
var_imp_plot(rf)