The checkdata package evaluates whether a numeric vector is normally distributed, using two methods:
1. Visualization
- Histogram
- QQ-plot
2. The Shapiro-Wilk test
- compares the p-value with a user-defined significance level
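The Shapiro-Wilk half of the check can be sketched with base R's shapiro.test(). This is a minimal illustration, not the package's source: the function name shapiro_decision and its alpha argument are hypothetical stand-ins for the user-defined significance level.

```r
# Hypothetical helper (not part of checkdata): run the Shapiro-Wilk test and
# compare its p-value with a user-defined significance level `alpha`.
shapiro_decision <- function(x, alpha = 0.05) {
  stopifnot(is.numeric(x), alpha > 0, alpha < 1)
  p_value <- stats::shapiro.test(x)$p.value
  list(
    p_value = p_value,
    alpha = alpha,
    # TRUE means the test does not reject normality at level `alpha`
    consistent_with_normal = p_value > alpha
  )
}

result <- shapiro_decision(mtcars$mpg, alpha = 0.05)
print(result$consistent_with_normal)
```

A p-value above alpha means the test fails to reject normality; it does not prove that the data are normal.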
Create a new project in RStudio and set the project type to package in the local RStudio environment.
Generate a README file with use_readme_md().
Generate a LICENSE file with use_mit_license("Shuyi Tan").
Wrap the normarlity_test() function in an R script named normality.R and save it in the R folder.
Generate a test folder with use_test("my-test").
Make a vignette with usethis::use_vignette("Vignette").
Document the files with devtools::document().
Create a remote repository on github.com named "checkdata".
Run the following commands to push the local package framework to the "checkdata" repo:
git init
git add .
git commit -m "initial commit"
git remote add origin https://github.com/yelselmiao/checkdata
git remote -v
git push -f origin master
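The R-side steps of the workflow above (everything before the git commands) could also be collected into one function. This is only a sketch under assumptions: scaffold_checkdata and its arguments are hypothetical names, and usethis::create_package() is used as a scripted alternative to creating the package project through the RStudio dialog. The function is defined but not called here, because calling it writes files to disk.

```r
# Hypothetical wrapper (not from the original workflow): collects the usethis
# and devtools calls listed above into one function.
scaffold_checkdata <- function(path = "checkdata", holder = "Shuyi Tan") {
  # Assumption: scripted stand-in for the RStudio "new package project" step.
  usethis::create_package(path, open = FALSE)
  # Point usethis at the new package so the calls below act on it.
  usethis::proj_set(path)

  usethis::use_readme_md()          # README.md
  usethis::use_mit_license(holder)  # LICENSE and LICENSE.md
  usethis::use_test("my-test")      # tests/testthat/test-my-test.R
  usethis::use_vignette("Vignette") # vignettes/Vignette.Rmd

  # R/normality.R still has to be added by hand before documenting.
  devtools::document(path)
  invisible(path)
}
```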
The package can be installed by running the following line:
devtools::install_github("yelselmiao/checkdata")
library(checkdata)
For example, to check whether the mpg column of the mtcars dataset is normally distributed:
data("mtcars")
normarlity_test(mtcars$mpg)
#> "You data is normal,because your p_value = 0.122881358539443 > 0.05"
The density plot approximately overlaps the standard normal density plot, and most of the points lie on the diagonal line in the QQ-plot, so we can roughly infer that the mpg column is normally distributed. This inference is further supported by the Shapiro-Wilk test, which does not reject normality at the 0.05 level (p ≈ 0.123).
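The visual part of this check can be reproduced with base R graphics. The sketch below is not the package's plotting code: plot_normality_sketch is a hypothetical name, and the overlaid curve is assumed to be a normal density using the sample mean and standard deviation. It also returns the correlation between theoretical and sample quantiles as a rough numeric summary of how closely the QQ-plot points follow a straight line.

```r
# Hypothetical helper (not part of checkdata): draw a density-scaled histogram
# with a fitted normal curve, plus a normal QQ-plot with its reference line.
plot_normality_sketch <- function(x) {
  stopifnot(is.numeric(x))
  old_par <- graphics::par(mfrow = c(1, 2))
  on.exit(graphics::par(old_par), add = TRUE)

  # Left panel: histogram on the density scale with a N(mean(x), sd(x)) curve.
  graphics::hist(x, freq = FALSE, main = "Histogram", xlab = "x")
  grid <- seq(min(x), max(x), length.out = 200)
  graphics::lines(grid, stats::dnorm(grid, mean = mean(x), sd = stats::sd(x)))

  # Right panel: sample quantiles against theoretical normal quantiles.
  qq <- stats::qqnorm(x, main = "Normal QQ-plot")
  stats::qqline(x)

  # Correlation near 1 means the points lie close to a straight line.
  invisible(stats::cor(qq$x, qq$y))
}
```

Calling plot_normality_sketch(mtcars$mpg) draws both panels side by side; the returned correlation is only an informal aid and does not replace the Shapiro-Wilk test.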