scatterplot: Scatterplot

scatterplot    R Documentation

Scatterplot

Description

Creates a scatter plot and calculates a correlation between two variables.

Usage

scatterplot(
  data = NULL,
  x_var_name = NULL,
  y_var_name = NULL,
  dot_label_var_name = NULL,
  weight_var_name = NULL,
  alpha = 1,
  annotate_stats = TRUE,
  annotate_y_pos = 5,
  annotated_stats_color = "green4",
  annotated_stats_font_size = 6,
  annotated_stats_font_face = "bold",
  line_of_fit_type = "lm",
  ci_for_line_of_fit = FALSE,
  line_of_fit_color = "blue",
  line_of_fit_thickness = 1,
  dot_color = "black",
  x_axis_label = NULL,
  y_axis_label = NULL,
  dot_size = 2,
  dot_label_size = NULL,
  dot_size_range = c(3, 12),
  jitter_x_percent = 0,
  jitter_y_percent = 0,
  jitter_x_y_percent = 0,
  cap_axis_lines = TRUE,
  color_dots_by = NULL,
  png_name = NULL,
  save_as_png = FALSE,
  width = 16,
  height = 9
)

Arguments

data

a data object (a data frame or a data.table)

x_var_name

name of the variable that will go on the x axis

y_var_name

name of the variable that will go on the y axis

dot_label_var_name

name of the variable that will be used to label individual observations

weight_var_name

name of the variable by which to weight the individual observations for calculating correlation and plotting the line of fit

alpha

opacity of the dots (0 = completely transparent, 1 = completely opaque)

annotate_stats

if TRUE, the correlation and p-value will be annotated at the top of the plot (default = TRUE)

annotate_y_pos

vertical position of the annotated stats, expressed as the percentage of the range of y values by which the stats will be placed above the maximum value of y in the data set (default = 5). For example, if annotate_y_pos = 5, and the minimum and maximum y values in the data set are 0 and 100, respectively, the annotated stats will be placed 5% of the y range (100 - 0) above the maximum y value, at y = 100 + 0.05 * (100 - 0) = 105.
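
The arithmetic above can be sketched in a few lines of base R. This is only an illustration of the documented formula: annotation_y() is a hypothetical helper introduced here, not a function exported by the package.

```r
# Hypothetical helper (not part of the package): computes the y coordinate
# at which the stats would be annotated, following the documented formula.
annotation_y <- function(y, annotate_y_pos = 5) {
  y_range <- max(y, na.rm = TRUE) - min(y, na.rm = TRUE)
  max(y, na.rm = TRUE) + annotate_y_pos / 100 * y_range
}

# Documented example: y values spanning 0 to 100 give 105.
annotation_y(c(0, 40, 100))

# The same calculation for the mpg column used in the Examples section.
annotation_y(mtcars$mpg)
```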

annotated_stats_color

color of the annotated stats (default = "green4").

annotated_stats_font_size

font size of the annotated stats (default = 6).

annotated_stats_font_face

font face of the annotated stats (default = "bold").

line_of_fit_type

if line_of_fit_type = "lm", a regression line will be fit; if line_of_fit_type = "loess", a local regression line will be fit; if line_of_fit_type = "none", no line will be fit
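
To give a feel for the difference between the two fit types, the sketch below fits both models with base R's lm() and loess(). It is an assumption-laden illustration, not a description of what scatterplot() calls internally; the object names are made up for this example.

```r
# "lm" corresponds to a straight regression line,
# "loess" to a locally fitted (curved) regression line.
fit_lm <- lm(mpg ~ wt, data = mtcars)
fit_loess <- loess(mpg ~ wt, data = mtcars)

# Compare predictions at a few x values inside the observed range of wt.
new_x <- data.frame(wt = c(2, 3, 4))
pred_lm <- predict(fit_lm, newdata = new_x)
pred_loess <- predict(fit_loess, newdata = new_x)

# The lm predictions change by a constant step per unit of wt;
# the loess predictions generally do not.
data.frame(wt = new_x$wt, lm = pred_lm, loess = pred_loess)
```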

ci_for_line_of_fit

if ci_for_line_of_fit = TRUE, the confidence interval for the line of fit will be shaded (default = FALSE)

line_of_fit_color

color of the line of fit (default = "blue")

line_of_fit_thickness

thickness of the line of fit (default = 1)

dot_color

color of the dots (default = "black")

x_axis_label

alternative label for the x axis

y_axis_label

alternative label for the y axis

dot_size

size of the dots on the plot (default = 2)

dot_label_size

size of the dot labels on the plot. If no input is entered for this argument, it will be set to dot_label_size = 5 by default. If the plot is to be weighted by some variable, this argument will be ignored, and dot sizes will be determined by the argument dot_size_range

dot_size_range

minimum and maximum size for dots on the plot when they are weighted (default = c(3, 12))

jitter_x_percent

horizontally jitter dots by a percentage of the range of x values (default = 0)

jitter_y_percent

vertically jitter dots by a percentage of the range of y values (default = 0)

jitter_x_y_percent

horizontally and vertically jitter dots by a percentage of the range of x and y values (default = 0)
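
As a rough illustration of what jittering by a percentage of the range means, the sketch below shifts each value by a random amount no larger than the given percentage of the data's range. jitter_by_percent() is a hypothetical helper, and the use of uniform noise is an assumption made for this example; the function's actual jittering mechanism may differ.

```r
# Hypothetical helper (not part of the package): shifts each value by a
# random amount of at most `percent` percent of the range of the values.
# Uniform noise is an assumption made for illustration.
jitter_by_percent <- function(values, percent) {
  value_range <- max(values, na.rm = TRUE) - min(values, na.rm = TRUE)
  max_shift <- percent / 100 * value_range
  values + runif(length(values), min = -max_shift, max = max_shift)
}

set.seed(1)
x_jittered <- jitter_by_percent(mtcars$wt, percent = 2)

# Largest shift applied; it cannot exceed 2% of the range of wt.
max(abs(x_jittered - mtcars$wt))
```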

cap_axis_lines

logical. Should the axis lines be capped at the outer tick marks? (default = TRUE)

color_dots_by

name of the variable that will determine colors of the dots

png_name

name of the PNG file to be saved. By default, the name will be "scatterplot_" followed by a timestamp of the current time. The timestamp will be in the format jan_01_2021_1300_10_000001, where "jan_01_2021" indicates January 01, 2021; "1300" indicates 13:00 (i.e., 1 PM); and "10_000001" indicates 10.000001 seconds past 13:00.
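
The timestamp format described above can be reproduced approximately in base R. The sketch below is an assumption about how such a string could be built, not the package's own code; make_timestamp() is a hypothetical helper. It uses month.abb so that the month abbreviation does not depend on the session locale.

```r
# Hypothetical helper (not part of the package): builds a timestamp in the
# documented pattern mon_dd_yyyy_hhmm_ss_ffffff.
make_timestamp <- function(time = Sys.time()) {
  month <- tolower(month.abb[as.integer(format(time, "%m"))])
  day_year_clock <- format(time, "%d_%Y_%H%M")
  # "%OS6" prints seconds with six decimal places; the decimal point is
  # then replaced with an underscore.
  seconds <- gsub(".", "_", format(time, "%OS6"), fixed = TRUE)
  paste(month, day_year_clock, seconds, sep = "_")
}

# A fixed time is used here so that the result is reproducible.
example_time <- as.POSIXct("2021-01-01 13:00:10.5", tz = "UTC")
default_name <- paste0("scatterplot_", make_timestamp(example_time))
default_name
```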

save_as_png

if save_as_png = TRUE, the plot will be saved as a PNG file (default = FALSE).

width

width of the plot to be saved. This value is passed directly as the width argument to the ggsave function in the ggplot2 package (default = 16)

height

height of the plot to be saved. This value is passed directly as the height argument to the ggsave function in the ggplot2 package (default = 9)

Details

If a weighted correlation is to be calculated (i.e., if weight_var_name is specified), the following package must be installed prior to running the function: Package 'weights' v1.0 (or possibly a higher version) by Josh Pasek (2018), https://cran.r-project.org/package=weights
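
Before requesting a weighted correlation, it may help to confirm that the 'weights' package is available. The sketch below also shows what a weighted Pearson correlation looks like using base R's cov.wt(); this is a stand-in chosen for illustration, not the routine the function itself uses, and weighted_cor() is a hypothetical helper.

```r
# TRUE if the 'weights' package is installed, FALSE otherwise.
weights_available <- requireNamespace("weights", quietly = TRUE)
if (!weights_available) {
  message("Install it first with install.packages(\"weights\").")
}

# Hypothetical helper: weighted Pearson correlation via stats::cov.wt().
# Weights are rescaled to sum to 1 before being passed on.
weighted_cor <- function(x, y, w) {
  cov.wt(cbind(x, y), wt = w / sum(w), cor = TRUE)$cor[1, 2]
}

# Correlation between wt and mpg, weighted by drat.
r_weighted <- weighted_cor(mtcars$wt, mtcars$mpg, w = mtcars$drat)

# With equal weights the result matches the ordinary Pearson correlation.
r_equal <- weighted_cor(mtcars$wt, mtcars$mpg, w = rep(1, nrow(mtcars)))
c(weighted = r_weighted, equal_weights = r_equal)
```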

Value

the output will be a scatter plot (a ggplot object).

Examples

## Not run: 
scatterplot(data = mtcars, x_var_name = "wt", y_var_name = "mpg")
scatterplot(
  data = mtcars, x_var_name = "wt", y_var_name = "mpg",
  dot_label_var_name = "hp", weight_var_name = "drat",
  annotate_stats = TRUE)
scatterplot(
  data = mtcars, x_var_name = "wt", y_var_name = "mpg",
  dot_label_var_name = "hp", weight_var_name = "cyl",
  dot_label_size = 7, annotate_stats = TRUE)

## End(Not run)

kim documentation built on Oct. 9, 2023, 5:08 p.m.