scatterplot: Scatterplot for Continuous Variables

scatterplot    R Documentation

Scatterplot for Continuous Variables

Description

Produce a matrix of plots for continuous variables: scatterplots, histograms, correlations, and missing values. Inspired by the ggpairs function of the R package GGally.

Usage

scatterplot(
  data,
  formula,
  columns,
  format = NULL,
  group = NULL,
  transform = NULL,
  facet = "grid",
  alpha.point = 1,
  type.diag = "boxplot",
  bins = NULL,
  position.bar = "identity",
  linewidth.density = NULL,
  alpha.area = NULL,
  method.cor = "pearson",
  name.cor = "r",
  size.cor = NULL,
  digits = c(3, 2),
  display.NA = NULL,
  color = NULL,
  xlim = NULL,
  ylim = NULL,
  size.axis = NULL,
  size.legend = NULL,
  size.facet = NULL
)

Arguments

data

[data.frame] dataset containing the variables to be displayed.

formula

[formula] formula indicating the variables to be used (outcome~time|id). Long format only.

columns

[character vector] Columns whose numerical values are to be displayed. Wide format only.

format

[character] Is the dataset in the long ("long") or wide ("wide") format?

group

[character] optional group variable used to color the points, stratify the histogram/density and correlation.

transform

[character or function] optional transformation to be applied on the outcome.

facet

[character] whether to use ggplot2::facet_grid ("grid") or ggh4x::facet_grid2 ("grid2").

alpha.point

[numeric] the transparency level used to display the points in the scatterplot.

type.diag

[character] type of graphical display on the diagonal: "boxplot", "histogram", or "density".

bins

[character or numeric vector] algorithm, values, or number of values used to create the histogram cells. When using facet="grid2" and type.diag="density", a character vector of length two indicating the bandwidth and the kernel to be used. See ggplot2::stat_density.

position.bar

[character] passed to geom_histogram (argument position). Only relevant when having multiple groups and using ggh4x::facet_grid2.

linewidth.density

[numeric,>0] width of the lines on the density plot.

alpha.area

[numeric, 0-1] the transparency level used to display the area under the density curve or histogram.

method.cor

[character] estimator of the correlation. Argument passed to stats::cor. When NA, the correlation is not displayed.

name.cor

[character] character used to represent the correlation. By default "r" but can be changed to "\u03C1" to display the Greek letter rho (ρ).

size.cor

[numeric,>0] size of the font used to display the correlation or information about missing values.

digits

[numeric of length 2] number of digits used to display the correlation or round the percentage of missing values.

display.NA

[0:2 or "only"] Should the number of missing values be displayed? When set to 2, the percentage of missing values is also displayed.

color

[character vector] color used to display the values for each group.

xlim

[numeric,>0 or "common"] range of the x-axis.

ylim

[numeric,>0 or "common"] range of the y-axis.

size.axis

[numeric,>0] size of the font used to display the tick labels.

size.legend

[numeric,>0] size of the font used to display the legend. Can have a second element to control the size of the legend key.

size.facet

[numeric,>0] size of the font used to display the facets (row and column names).
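The arguments group, method.cor, digits, and display.NA refer to per-group correlations and to counts and percentages of missing values. As a rough base-R sketch of these quantities (not the package's implementation), the code below uses made-up data and variable names. It assumes that correlations are computed on complete pairs and that the two elements of digits apply to the correlation and to the percentage respectively; neither is stated above.

```r
## Toy wide-format data with a grouping variable (all names are hypothetical).
set.seed(1)
df <- data.frame(group = rep(c("a", "b"), each = 10),
                 x = rnorm(20))
df$y <- df$x + rnorm(20, sd = 0.5)
df$y[c(3, 14)] <- NA  # introduce two missing values

## Correlation within each group (assumption: complete pairs only).
## The method argument of stats::cor plays the role of method.cor.
cor.group <- sapply(split(df, df$group), function(d) {
  cor(d$x, d$y, method = "pearson", use = "complete.obs")
})
print(round(cor.group, digits = 3))

## Number and percentage of missing values per variable.
n.NA <- colSums(is.na(df[, c("x", "y")]))
pc.NA <- round(100 * n.NA / nrow(df), digits = 2)
print(rbind(n = n.NA, percent = pc.NA))
```

Replacing "pearson" by "spearman" or "kendall" gives the other estimators available in stats::cor.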

Details

In the long format, the outcome variable contains the numerical values to be displayed. The time variable is used to split the outcome; each split is displayed either separately or jointly with one other split. The identifier links the outcome values across time.
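To illustrate how the long and wide formats relate, the following sketch splits an outcome by time and links the values through the identifier using stats::reshape. The dataset and its variable names (id, time, outcome) are invented for illustration, and the code only mimics the data layout; it does not reproduce what scatterplot does internally.

```r
## Toy data in long format: one row per subject and time point
## (variable names are hypothetical).
long <- data.frame(
  id      = rep(1:4, each = 3),
  time    = rep(c("t1", "t2", "t3"), times = 4),
  outcome = c(10.1,  9.5,  9.0,
              12.3, 11.8,   NA,
               8.7,  8.9,  8.2,
              11.0, 10.4, 10.1)
)

## Split the outcome by time; id matches the values across time points.
wide <- reshape(long, idvar = "id", timevar = "time",
                v.names = "outcome", direction = "wide", sep = ".")
print(wide)

## Each pair of time-specific columns would correspond to one
## "jointly displayed" pair of splits.
pairs.time <- combn(setdiff(names(wide), "id"), 2)
print(pairs.time)
```

With LMMstar installed, one would expect scatterplot(long, formula = outcome~time|id) and scatterplot(wide, columns = c("outcome.t1", "outcome.t2", "outcome.t3")) to display the same variables, in the same way as the gastricbypassL and gastricbypassW calls in the Examples section.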

Value

A list of ggplot objects (facet="grid") or a single ggplot object (facet="grid2").

Examples

data(gastricbypassL, package = "LMMstar")
gastricbypassL$group <- as.numeric(gastricbypassL$id) %% 3
data(gastricbypassW, package = "LMMstar")

## single group (wide or long format)
scatterplot(gastricbypassL, formula = weight~time|id)
scatterplot(gastricbypassW, columns = paste0("weight",1:4))

## Not run: 
## use histogram instead of boxplot
scatterplot(gastricbypassL, formula = weight~time|id, type.diag = "histogram")
scatterplot(gastricbypassL, formula = weight~time|id, type.diag = "histogram", bins = 15)

## same scale
scatterplot(gastricbypassL, formula = weight~time|id,
            xlim = "common", ylim = "common")

## transform outcome
scatterplot(gastricbypassL, formula = weight~time|id, transform = "log")

## handling missing values
scatterplot(gastricbypassL, formula = glucagonAUC~time|id)

## coloring per group
scatterplot(gastricbypassL, formula = weight~time|id, group = "group")

## only display NAs
scatterplot(gastricbypassL, formula = glucagonAUC~time|id,
            display.NA = "only", group = "group")
scatterplot(gastricbypassL, formula = glucagonAUC~time|id,
            display.NA = "only", group = "group", size.legend = c(15,2))

## End(Not run)

LMMstar documentation built on Nov. 9, 2023, 1:06 a.m.