While the other vignette shows you how to use perccalc appropriately, there are instances where there are simply too few categories to estimate percentiles properly. Imagine estimating a distribution of percentiles 1 to 100 from only three ordered categories: it sounds far-fetched.
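To make the problem concrete, here is a minimal base R sketch of how little information three categories carry on the percentile scale. The counts are hypothetical and chosen only for illustration, and the midpoint construction is a simplified picture of the general idea, not necessarily perccalc's exact internal procedure.

```r
# Hypothetical counts for a three-category ordered variable (illustrative only).
counts <- c(low = 50, middle = 30, high = 20)

# Upper edge of each category's band on the 0-100 percentile scale.
upper <- cumsum(counts) / sum(counts) * 100

# Lower edge: the previous category's upper edge, starting at 0.
lower <- c(0, head(upper, -1))

# One representative point per category: the middle of its band.
midpoint <- (lower + upper) / 2

data.frame(category = names(counts), lower, upper, midpoint, row.names = NULL)
```

The three categories give only three points on the percentile scale (25, 65 and 90 here), and every other percentile has to be filled in from those three.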
Let's load our packages.
library(perccalc)
library(dplyr)
library(ggplot2)
For example, take the survey data on smoking habits.
smoking_data <-
  MASS::survey %>% # you will need to install the MASS package
  as_tibble() %>%
  select(Sex, Smoke, Pulse) %>%
  rename(
    gender = Sex,
    smoke = Smoke,
    pulse_rate = Pulse
  )
The final result is this dataset:
smoking_data %>% arrange(pulse_rate)
Note that there are only four categories in the smoke variable. Let's try to estimate the percentile difference.
smoking_data <-
  smoking_data %>%
  mutate(smoke = factor(smoke,
                        levels = c("Never", "Occas", "Regul", "Heavy"),
                        ordered = TRUE))

perc_diff(smoking_data, smoke, pulse_rate)
perc_diff returns the estimated coefficient but also warns you that it's difficult for the function to estimate the standard error. The same happens with perc_dist.
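One plausible way to see why standard errors become a problem is to count degrees of freedom. The sketch below assumes, as a simplification, that the estimate comes from a cubic polynomial fitted through one point per category; whether perccalc does exactly this internally is an assumption here. The x and y values are made up for illustration and are not results from the survey data.

```r
# Illustrative values only: one percentile midpoint (rescaled to 0-1) and
# one mean outcome per category, for a four-category variable.
x <- c(0.10, 0.40, 0.70, 0.95)
y <- c(72, 74, 73, 78)

# Assumed model: a cubic polynomial has four coefficients,
# which is as many as there are data points.
fit <- lm(y ~ x + I(x^2) + I(x^3))

# No residual degrees of freedom are left to estimate the error variance.
df.residual(fit)

# The coefficients are still estimated, but their standard errors are not finite.
std_errors <- summary(fit)$coefficients[, "Std. Error"]
std_errors
```

With four points and four coefficients the curve passes through every point, so point estimates are available while their uncertainty is not. Under this assumption, that matches the behaviour described above: a coefficient is returned, but the standard error cannot be estimated reliably.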
perc_dist(smoking_data, smoke, pulse_rate) %>% head()