normality | R Documentation |
normality() performs the Shapiro-Wilk test of normality on numerical variables.
normality(.data, ...)
## S3 method for class 'data.frame'
normality(.data, ..., sample = 5000)
## S3 method for class 'grouped_df'
normality(.data, ..., sample = 5000)
.data |
a data.frame or a |
... |
one or more unquoted expressions separated by commas. You can treat variable names as if they were positions. Positive values select variables; negative values drop variables. If the first expression is negative, normality() will automatically start with all variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing. |
sample |
the number of observations sampled to perform the test. See vignette("EDA") for an introduction to these concepts. |
This function is useful when used with the group_by
function of the dplyr package. If you want to test by level of the categorical
data you are interested in, rather than the whole observation,
you can pass a grouped_df created by the group_by function.
The test is computed with the shapiro.test
function.
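To make the per-group idea concrete, here is a minimal sketch that uses only base R; it is not dlookr's implementation. It calls stats::shapiro.test once for each level of a grouping variable. The helper name shapiro_by_group is hypothetical, and the built-in iris data set is used only as a stand-in for a data frame with one numeric and one categorical column.

```r
# Hypothetical helper (not part of dlookr): Shapiro-Wilk test per group.
shapiro_by_group <- function(x, g) {
  pieces <- split(x, g)                       # one numeric vector per group level
  res <- lapply(pieces, stats::shapiro.test)  # one "htest" object per level
  data.frame(
    group     = names(res),
    statistic = vapply(res, function(r) unname(r$statistic), numeric(1)),
    p_value   = vapply(res, function(r) r$p.value, numeric(1)),
    sample    = vapply(pieces, length, integer(1)),
    row.names = NULL
  )
}

# Stand-in data: test Sepal.Length separately for each Species.
out <- shapiro_by_group(iris$Sepal.Length, iris$Species)
print(out)
```

The result has one row per group, which mirrors the shape of output you would expect when a grouped data frame is tested level by level.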
An object of the same class as .data.
The information derived from the numerical data test is as follows.
statistic : the value of the Shapiro-Wilk statistic.
p_value : an approximate p-value for the test. This is said in Royston (1995) to be adequate for p_value < 0.1.
sample : the number of samples to perform the test. The number of observations supported by the stats::shapiro.test function is 3 to 5000.
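The 3 to 5000 limit noted above comes from stats::shapiro.test itself. As a hedged sketch (base R only, not dlookr's internals), the snippet below draws a random subset before testing when a vector is longer than the chosen sample size. The helper name shapiro_sampled and the simulated data are assumptions made for illustration.

```r
# Hypothetical helper (not part of dlookr): test at most `size` random values.
shapiro_sampled <- function(x, size = 5000) {
  x <- x[!is.na(x)]                  # drop missing values first
  n <- min(length(x), size)
  stopifnot(n >= 3, n <= 5000)       # range accepted by stats::shapiro.test
  if (length(x) > n) x <- sample(x, n)
  res <- stats::shapiro.test(x)
  list(statistic = unname(res$statistic), p_value = res$p.value, sample = n)
}

set.seed(1)                          # simulated data, for illustration only
big <- rnorm(20000)
r_full  <- shapiro_sampled(big)              # tests 5000 sampled values
r_small <- shapiro_sampled(big, size = 200)  # tests 200 sampled values
str(r_full)
str(r_small)
```

Because the test runs on a random subset, the statistic and p-value can change from run to run unless a seed is set.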
normality.tbl_dbi, diagnose_numeric.data.frame, describe.data.frame, plot_normality.data.frame.
# Normality test of numerical variables
normality(heartfailure)
# Select the variables to test
normality(heartfailure, platelets, sodium, sample = 200)
# Using dplyr::grouped_df
library(dplyr)
gdata <- group_by(heartfailure, smoking, death_event)
normality(gdata, "platelets")
normality(gdata, sample = 250)
# Positive values select variables
heartfailure %>%
normality(platelets, sodium)
# Using pipes & dplyr -------------------------
# Test all numerical variables by 'smoking' and 'death_event',
# and keep only the rows where 'smoking' is "No".
heartfailure %>%
group_by(smoking, death_event) %>%
normality() %>%
filter(smoking == "No")
# Keep only the rows where 'sex' is "Male",
# and test 'platelets' by 'smoking' and 'death_event'
heartfailure %>%
filter(sex == "Male") %>%
group_by(smoking, death_event) %>%
normality(platelets)
# Test the log(platelets) variable by 'smoking' and 'death_event',
# and keep only the rows with p_value greater than 0.01.
heartfailure %>%
mutate(platelets_income = log(platelets)) %>%
group_by(smoking, death_event) %>%
normality(platelets_income) %>%
filter(p_value > 0.01)