Illustrative data: starwars and storms

The examples below use the starwars and storms datasets from the dplyr package.

# some example data
data(starwars, package = "dplyr")
data(storms, package = "dplyr")

To illustrate comparisons of dataframes, we use the starwars data to produce two new dataframes, star_1 and star_2. Each is a random sample of 50 rows of the original, and star_2 additionally drops the first two columns.

library(dplyr)
star_1 <- starwars %>% sample_n(50)
star_2 <- starwars %>% sample_n(50) %>% select(-1, -2)
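
Because sample_n() draws rows at random, star_1 and star_2 differ from run to run. A minimal sketch of a reproducible variant follows; it assumes dplyr is installed, and the seed value 42 is an arbitrary choice that does not come from the original example.

```r
library(dplyr)
data(starwars, package = "dplyr")

# Fix the random number generator state so the samples are repeatable.
# The seed value is arbitrary (an assumption for this illustration).
set.seed(42)

# 50 randomly sampled rows, all columns kept
star_1 <- starwars %>% sample_n(50)

# 50 randomly sampled rows, first two columns dropped
star_2 <- starwars %>% sample_n(50) %>% select(-1, -2)

dim(star_1)
dim(star_2)
```

Re-running the block from set.seed() onwards should reproduce the same two samples.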

inspect_cor() for a single dataframe

inspect_cor() returns a tibble containing Pearson's correlation coefficient, confidence intervals and p-values for pairs of numeric columns. The function combines the functionality of cor() and cor.test() in a more convenient wrapper.

library(inspectdf)
inspect_cor(storms)
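
To make the "wrapper around cor.test()" idea concrete, the sketch below computes the same kind of summary by hand using base R only. It is not inspectdf's implementation: pairwise_cor() is a hypothetical helper, its output column names are illustrative rather than a claim about what inspect_cor() returns, and the built-in mtcars data stands in for storms so that the block runs without extra packages.

```r
# Hypothetical helper: Pearson correlation, p-value and confidence interval
# for every pair of numeric columns, using base R's cor.test().
pairwise_cor <- function(df, conf_level = 0.95) {
  # keep only the numeric columns
  num <- df[vapply(df, is.numeric, logical(1))]
  # all unordered pairs of column names
  pairs <- combn(names(num), 2, simplify = FALSE)
  rows <- lapply(pairs, function(p) {
    # cor.test() uses only the rows where both columns are non-missing
    ct <- cor.test(num[[p[1]]], num[[p[2]]], conf.level = conf_level)
    data.frame(
      col_1   = p[1],
      col_2   = p[2],
      corr    = unname(ct$estimate),
      p_value = ct$p.value,
      lower   = ct$conf.int[1],
      upper   = ct$conf.int[2]
    )
  })
  out <- do.call(rbind, rows)
  # sort by the magnitude of the correlation, largest first
  out[order(-abs(out$corr)), ]
}

# mtcars is a stand-in dataset (an assumption for this illustration)
res <- pairwise_cor(mtcars[, c("mpg", "wt", "hp", "qsec")])
print(res)
```

Writing this loop for every dataframe is tedious, which is the convenience a single call such as inspect_cor(storms) is meant to provide.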

Passing the result to show_plot() prints a plot of the point estimates and their confidence intervals. Intervals that straddle the null value of 0 are shown in gray:

inspect_cor(storms) %>% show_plot()

Notes:

inspect_cor() for two dataframes

When a second dataframe is provided, inspect_cor() returns a tibble that compares the correlation coefficients in the first dataframe with those in the second. The p_value column measures the evidence against the two correlation coefficients being equal.

inspect_cor(storms, storms[-c(1:200), ])
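
For intuition about what a p-value for equality of two correlations can look like, here is a minimal sketch of one common approach, a two-sided test on Fisher z-transformed coefficients. This is an assumption for illustration, not a statement of how inspect_cor() computes its p_value column. compare_cor() is a hypothetical helper, and the test assumes the two samples are independent.

```r
# Hypothetical helper: two-sided p-value for H0: rho_1 = rho_2,
# based on Fisher's z-transformation of two sample correlations.
# r1, r2: sample Pearson correlations; n1, n2: sample sizes (each > 3).
compare_cor <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                          # Fisher z-transform of r1
  z2 <- atanh(r2)                          # Fisher z-transform of r2
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z1 - z2
  z  <- (z1 - z2) / se
  2 * pnorm(-abs(z))                       # two-sided normal p-value
}

# Identical correlations give no evidence of a difference
p_same <- compare_cor(0.5, 100, 0.5, 100)

# Very different correlations give strong evidence of a difference
p_diff <- compare_cor(0.9, 100, 0.1, 100)

print(c(p_same = p_same, p_diff = p_diff))
```

A small p-value here suggests the two correlations differ, while a large one means the data are compatible with the coefficients being equal.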

To plot the comparison of the top 20 correlation coefficients:

inspect_cor(storms, storms[-c(1:200), ]) %>% 
  slice(1:20) %>%
  show_plot()

Notes:



alastairrushworth/inspectdf documentation built on Aug. 15, 2022, 1:23 a.m.