knitr::opts_chunk$set(error = TRUE)
Load the packages that this R package requires.
library(ggplot2)
library(tidyverse)
Now, install this R package:
devtools::install_github("kateparrish9887/R_package_Parrish")
library(ParrishRPackage)
Let's download some data for our package:
download.file("https://raw.githubusercontent.com/kateparrish9887/R_package_Parrish/master/data/Butterfly_data.csv", "/cloud/project/data/Butterfly_data.csv")
download.file("https://raw.githubusercontent.com/kateparrish9887/R_package_Parrish/master/data/FossilAnts.csv", "/cloud/project/data/FossilAnts.csv")
download.file("https://raw.githubusercontent.com/kateparrish9887/R_package_Parrish/master/data/combined.csv", "/cloud/project/data/combined.csv")
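Note that download.file() stops with an error if the destination folder does not exist yet. The following is a minimal sketch of one way to guard against that. It assumes you are happy to store the files in a temporary folder instead of /cloud/project/data, and 'fetch_if_missing' is a hypothetical helper written for this example, not part of the package.

```r
# Assumption: store the files under a temporary folder, not /cloud/project/data.
data_dir <- file.path(tempdir(), "data")

# Create the folder (and any missing parents); do nothing if it already exists.
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

base_url <- "https://raw.githubusercontent.com/kateparrish9887/R_package_Parrish/master/data"
files <- c("Butterfly_data.csv", "FossilAnts.csv", "combined.csv")

# Hypothetical helper: download a file only if it is not already on disk,
# and return the local path either way.
fetch_if_missing <- function(file, dir = data_dir) {
  dest <- file.path(dir, file)
  if (!file.exists(dest)) {
    download.file(file.path(base_url, file), dest)
  }
  dest
}

# Usage (needs network access), left commented out here:
# paths <- vapply(files, fetch_if_missing, character(1))
```

If you use a folder like this, pass the returned paths to read_csv() in place of the /cloud/project/data paths.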
To make the examples run more smoothly, let's load the data frames and give each one a name:
butterfly <- read_csv("/cloud/project/data/Butterfly_data.csv")
ants <- read_csv("/cloud/project/data/FossilAnts.csv")
combined <- read_csv("/cloud/project/data/combined.csv")
The 'change_na_rm' function lets you convert a chosen non-NA value (for example, "None") to NA, if applicable, and then drops all NA values from the data set in a single call. If your data set already contains NA values, the function can simply be used to remove them.
This is useful for data cleanup, especially if you plan to plot the data set. The function returns a data set with no NA values; some examples are below.
library(ParrishRPackage)
change_na_rm(ants, "None")
library(ParrishRPackage)
change_na_rm(combined)
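To make the "change, then drop" idea concrete, here is a minimal base R sketch of that kind of cleanup. It is only an illustration under assumptions: 'change_na_rm_sketch' is a hypothetical stand-in, not the package's actual code, and the toy data frame is made up rather than taken from FossilAnts.csv.

```r
# Hypothetical stand-in for illustration; NOT the package's implementation.
change_na_rm_sketch <- function(data, value = NULL) {
  if (!is.null(value)) {
    # Turn every cell equal to `value` into NA, column by column.
    data[] <- lapply(data, function(col) {
      col[!is.na(col) & col == value] <- NA
      col
    })
  }
  # Drop every row that contains at least one NA.
  stats::na.omit(data)
}

# Toy data invented for this example.
ants_demo <- data.frame(
  tribe  = c("A", "None", "C", "D"),
  min_ma = c(10, 20, NA, 40),
  max_ma = c(15, 25, 35, 45)
)

# Row 2 is dropped because "None" becomes NA; row 3 already had an NA.
cleaned <- change_na_rm_sketch(ants_demo, "None")
cleaned
```

Calling the sketch without a second argument skips the conversion step and only drops rows that already contain NA.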
This function can also be used to strategically remove a column if needed. This is useful if a column in your data set was not recorded correctly, or if it simply needs to be removed for further data cleanup.
library(ParrishRPackage)
change_na_rm(butterfly, "Plebejus icarioides")
The 'line_range_plot' function draws a line range plot using the 'ggplot2' package.
It is useful for visualizing observations that have a minimum and a maximum value, such as minimum and maximum temperature, or minimum and maximum fossil age. The result is a line range plot that shows, for each x-axis value, the range from minimum to maximum on the y-axis. Examples of both cases are below.
library(ParrishRPackage)
line_range_plot(butterfly, ButterflySpecies, SpringTemp, SummerTemp)
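For readers who want to see what a line range plot looks like in plain 'ggplot2', here is a small sketch that uses geom_linerange() directly. It is not the package's implementation, which may differ, and the toy data and column names are invented for this example.

```r
library(ggplot2)

# Toy data invented for this example; not taken from Butterfly_data.csv.
temps <- data.frame(
  species = c("A", "B", "C"),
  low     = c(10, 12, 8),
  high    = c(20, 25, 18)
)

# One vertical segment per species, running from `low` to `high`.
p <- ggplot(temps, aes(x = species, ymin = low, ymax = high)) +
  geom_linerange() +
  labs(y = "Temperature")
p
```

Each x-axis value gets a single vertical line whose ends mark the minimum and maximum, which is the shape described above.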
Other functions from this package, such as 'change_na_rm', can be used for data cleanup before drawing the line range plot. A y-axis label is also added in the example for clarity.
library(ParrishRPackage)
ants %>%
  change_na_rm("None") %>%
  line_range_plot(Tribe, min_ma, max_ma) +
  labs(y = "Age (in Ma)")
The 'lm_num_cat' function plots a linear regression model using the 'ggplot2' package.
It is useful for data sets hypothesized to have a linear relationship between a numerical response and a numerical predictor. It can also plot the model with an additional categorical predictor alongside the numerical predictor and response. The function produces a scatter plot with a colored trend line.
If the function is given both a numerical and a categorical predictor along with the numerical response, the categorical predictor is represented by color on the scatter plot. An example of this is below.
library(ParrishRPackage)
lm_num_cat(ants, min_ma, max_ma, subfamily)
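As a rough picture of this kind of plot, the sketch below builds a scatter plot with points colored by a category and a single linear trend line, using plain 'ggplot2'. It is an approximation under assumptions: the package may draw its trend line differently, and the toy data and column names are invented for this example.

```r
library(ggplot2)

# Toy data invented for this example; not taken from FossilAnts.csv.
toy <- data.frame(
  predictor = c(1, 2, 3, 4, 5, 6),
  response  = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2),
  group     = c("a", "a", "a", "b", "b", "b")
)

# Points are colored by the categorical variable; the trend line is a
# single least-squares fit of response on predictor.
p <- ggplot(toy, aes(x = predictor, y = response)) +
  geom_point(aes(colour = group)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "blue")
p

# The same straight line, fitted directly so the coefficients can be inspected.
fit <- lm(response ~ predictor, data = toy)
coef(fit)
```

Dropping the `aes(colour = group)` mapping gives the version with only a numerical response and predictor.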
If a categorical predictor is not needed, supply only the numerical response and predictor as arguments.
The 'change_na_rm' function from this package can also be combined with this function to change and remove NA values from the data set before plotting the linear regression.
library(ParrishRPackage)
ants %>%
  change_na_rm("None") %>%
  lm_num_cat(min_ma, max_ma)