scatter: Summarize inferences using scatterplots


Description

Initializes a scatterplot aimed at summarizing inferences from regression models. This plot may: include confidence intervals, perhaps created from simulations; be clipped to the convex hull to avoid unwarranted extrapolation; and include simple linear or robust fits to the data. If you simply want to draw points on a tile plot, use pointsTile instead.

Usage

scatter(...)

Arguments

...

Any number of arguments given below. Must include exactly one horizontal dimension (x or top) and exactly one vertical dimension (y or right). All inputs should be identified by appropriate tags; i.e., use scatter(x=myxvar, y=myyvar), not scatter(myxvar,myyvar).

Details

This function does no plotting; instead, it creates a scatter object, or trace of plotting data, to be drawn on one or more plots in a tiled arrangement of plots. To complete the drawing, include the scatter object as an input to tile, where users can set further options including plot titles, axis titles, and axis scaling.

scatter offers many data processing and formatting options for the trace to be plotted. Confidence intervals (shown as horizontal or vertical lines, or both) can be calculated from simulations or posterior draws, or may be provided by the user. Alternatively, scatter can add simple fit lines and confidence intervals to the plotted data (e.g., a linear, robust, or loess fit). Optionally, results outside the convex hull of the original data can be hidden or flagged. Finally, the graphical parameters for each element of the scatter (including symbols, confidence intervals, or text) can be adjusted, often on a point-by-point basis.

Run through tile, output from scatter will yield a finished plot. The plot cannot be easily modified after creation. Rather, users should include in the initial call to tile additional traces for all desired annotations (text, symbols, lines, or polygons) to the finished plot.

Value

A scatter object, used only as an input to tile.

scatter-specific parameters

A call to scatter must provide an orthogonal pair of the following inputs:

x

coordinate vector of data to plot, attached to the x axis. x may be plotted directly, or treated as simulation data to summarize (see parameter simulates below).

y

coordinate vector of data to plot, attached to the y axis; may be simulation data.

top

coordinate vector of data to plot, attached to the top axis; may be simulation data.

right

coordinate vector of data to plot, attached to the right axis; may be simulation data.

The following inputs are all optional, and control the major features of scatter. It is usually best to use either ci or fit, but not both.

xlower

vector of same length as x containing user-provided lower bounds; only used when simulates is NULL

xupper

vector of same length as x containing user-provided upper bounds; only used when simulates is NULL

ylower

vector of same length as y containing user-provided lower bounds; only used when simulates is NULL

yupper

vector of same length as y containing user-provided upper bounds; only used when simulates is NULL

toplower

vector of same length as top containing user-provided lower bounds; only used when simulates is NULL

topupper

vector of same length as top containing user-provided upper bounds; only used when simulates is NULL

rightlower

vector of same length as right containing user-provided lower bounds; only used when simulates is NULL

rightupper

vector of same length as right containing user-provided upper bounds; only used when simulates is NULL
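As a minimal sketch of preparing user-provided bounds, the base R code below computes 95% confidence bounds for fitted values from lm and stores them in vectors of the same length as the plotted values. The simulated data and the names dat, xvals, yhat, lwr, and upr are illustrative assumptions; the commented call only indicates where such vectors would be passed.

```r
# Illustrative data (hypothetical, not part of the tile package)
set.seed(1)
dat <- data.frame(x = runif(50, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(50)

# Fit a linear model and compute 95% confidence bounds at five x values
model <- lm(y ~ x, data = dat)
xvals <- c(1, 3, 5, 7, 9)
pred <- predict(model, newdata = data.frame(x = xvals),
                interval = "confidence", level = 0.95)

yhat <- pred[, "fit"]   # point estimates
lwr  <- pred[, "lwr"]   # lower bounds, same length as yhat
upr  <- pred[, "upr"]   # upper bounds, same length as yhat

# With simulates left at NULL, the bounds would be supplied by tag:
# trace <- scatter(x = xvals, y = yhat, ylower = lwr, yupper = upr)
```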

simulates

A string identifying one of the variables (x, y, top, or right) as simulation data (by default is NULL, for no simulation data). If simulates is set to one of the plot dimensions, the orthogonal dimension will be treated as scenario codes grouping the simulations. For example, to plot summaries of 1,000 simulates drawn from the conditional distribution of the response variable y for each of 5 different values of a particular covariate, stack all 5,000 simulates in a single vector y, then create a corresponding 5,000-vector x listing the covariate value used to create each simulate. scatter will then calculate confidence intervals for each scenario, as requested in ci below.
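The stacking described above might look like the following sketch in base R. The covariate values, the data-generating model, and the names xscen, nsims, and bounds are hypothetical; the percentile summary is shown only for intuition and is an assumption, not a statement of how scatter computes its intervals.

```r
set.seed(123)
xscen <- c(1, 2, 3, 4, 5)   # five covariate values (hypothetical)
nsims <- 1000               # simulates per scenario

# Stack 1,000 simulated responses per scenario into one 5,000-vector,
# with a matching 5,000-vector recording the scenario of each draw
x <- rep(xscen, each = nsims)
y <- rnorm(length(x), mean = 1 + 2 * x, sd = 1)

# For intuition only: 95% percentile bounds within each scenario
# (rows are the 2.5% and 97.5% quantiles, columns are scenarios)
bounds <- sapply(split(y, x), quantile, probs = c(0.025, 0.975))

# The stacked vectors would then be passed by tag:
# trace <- scatter(x = x, y = y, simulates = "y", ci = list(levels = 0.95))
```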

plot

scalar or vector, the plot(s) in which this trace will be drawn; defaulting to the first plot. Plots are numbered consecutively from the top left, row-by-row. Thus in a 2 x 3 tiling, the first plot in the second row is plot number 4.
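The row-by-row numbering can be checked with a small helper. The function plotNumber is hypothetical and not part of tile; it simply encodes the rule stated above.

```r
# Hypothetical helper: plot number for a given row and column
# when plots are numbered consecutively from the top left, row by row
plotNumber <- function(row, col, ncols) {
  (row - 1) * ncols + col
}

# First plot in the second row of a 2 x 3 tiling
plotNumber(row = 2, col = 1, ncols = 3)   # 4
```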

ci

list, parameters governing the appearance and calculation of confidence intervals from data in lower and upper or provided by the simulations defined in simulates:

levels

scalar or vector of desired confidence intervals to calculate from the variable named by simulates; ignored if user provides bounds in lower and upper. Default is 0.95, which gives approximately 2-standard error bounds.

mark

vector of desired plotting styles for confidence intervals. The default and only current option is lines.
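To see why a level of 0.95 corresponds to roughly two standard errors, the sketch below computes two-sided critical values under a normal approximation. The normal approximation is an assumption made for illustration, and ciLevels and crit are hypothetical names.

```r
# Two-sided normal critical values for a vector of confidence levels
ciLevels <- c(0.90, 0.95)
crit <- qnorm(1 - (1 - ciLevels) / 2)
round(crit, 2)   # 1.64 1.96

# A vector of levels would be requested as:
# ci = list(levels = ciLevels)
```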

fit

list, parameters governing the appearance and calculation of simple fits to the two plotted dimensions:

method

The type of fit to apply: linear (default) fits a bivariate linear regression; wls fits a weighted linear regression; robust fits a robust regression using an M-estimator; mmest fits a robust regression using an MM-estimator; loess fits a loess smoother.

ci

vector of requested levels of confidence intervals for best fit line; default is 0.95. Set to NA for no confidence intervals.

mark

vector of desired plotting styles for confidence intervals (either shaded regions or dashed lines) for best fit line; default is shaded.

col

color of best fit line; default is black.

span

bandwidth parameter for loess; default is 0.95.

weights

vector of weights for wls fits.
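The fit options are collected in an ordinary list. The sketch below builds two such lists using only the element names documented above; the particular values, the weights vector w, and the names loessFit and wlsFit are hypothetical.

```r
# Loess fit with a narrower span and two interval levels, drawn as shaded regions
loessFit <- list(method = "loess", span = 0.8, ci = c(0.67, 0.95), mark = "shaded")

# Weighted linear fit with no confidence intervals;
# the weights vector needs one entry per plotted point
w <- c(1, 1, 2, 2, 4)   # hypothetical weights for five points
wlsFit <- list(method = "wls", weights = w, ci = NA, col = "black")

# Either list would be passed through the fit argument:
# trace <- scatter(x = myxvar, y = myyvar, fit = loessFit)
```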

extrapolate

list, parameters governing the plotting of extrapolation outside the convex hull of the covariate data, using whatif in the WhatIf package:

formula

optional formula object, used to specify the estimated model. Useful if the model contains functions of the covariates given in data below

data

matrix or dataframe, the actual values of all covariates used to estimate the model (omit the constant and response variable)

cfact

matrix or dataframe, the counterfactual values of all the covariates (omit the constant and response variable), one row for each scenario. The order of columns must match data, and the order of rows must match the order of the scenarios. If scenarios are calculated from simulates, then the rows must be listed from the scenario with the smallest factor level to the highest.

omit.extrapolated

If TRUE (the default), then the plotted trace and CIs are clipped to the convex hull; if FALSE, then extrapolation outside the convex hull is printed in a lighter color or with dashed or dotted lines.
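A sketch of assembling the extrapolate list in base R follows. The covariates educ and age, their values, and the names covars, cfact, and extrapolateOpts are hypothetical; the point is that cfact has one row per scenario and the same column order as data.

```r
# Hypothetical covariates used to estimate a model (no constant, no response)
set.seed(42)
covars <- data.frame(educ = runif(100, 8, 16),
                     age  = runif(100, 20, 60))

# One counterfactual row per scenario, columns in the same order as covars;
# the last scenario (educ = 20) lies outside the observed range of educ
cfact <- data.frame(educ = c(8, 12, 16, 20),
                    age  = mean(covars$age))

# Flag extrapolated scenarios rather than clipping them
extrapolateOpts <- list(data = covars, cfact = cfact, omit.extrapolated = FALSE)

# The list would be passed through the extrapolate argument:
# trace <- scatter(x = cfact$educ, y = yhat, extrapolate = extrapolateOpts)
```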

labelsxoffset

Scalar, horizontal offset for text labels. Default is 0.

labelsyoffset

Scalar, vertical offset for text labels. Default is 0.

In addition to these scatter-specific parameters, users may provide any of the generic tile parameters documented in pointsTile.

Author(s)

Christopher Adolph cadolph@u.washington.edu

See Also

tile, pointsTile

Examples

# Example:  Duncan's Occupational Prestige Data

# Load packages and data
require(tile)
require(car)
data(Duncan)
attach(Duncan)

# Convert job classes to numerical codes
jobclass <- (type=="prof")*1 + (type=="wc")*2 + (type=="bc")*3

# Make some nice colors for job classes (not run)
# require(RColorBrewer)
# col <- brewer.pal(3, "Dark2")

# The colors brewer.pal would produce in this case:
col <- c("#1B9E77", "#D95F02", "#7570B3")

# Pick some symbols for job classes 
# (run example(points) for meaning of symbol codes)
pch <- c(17, 15, 16)

# Create labels, symbols, and colors for points
labels <- as.vector(row.names(Duncan))
markers <- pch[jobclass]
colors <- col[jobclass]

# Create scatterplot trace
prestigeXeducation <- scatter(x = education,
                              y = prestige,
                              labels = labels,
                              pch = markers,
                              col = colors,
                              size = 1,
                              cex = 0.75,
                              labelsyoffset = -0.035,
                              plot = 1,
                              fit = list(method="mmest")
                              )

# Create legend traces
legendtitle <- textTile(labels="1950 US Occupations (Duncan, 1961)",
                        x=20, y=98,
                        col="black",
                        fontface="bold",
                        cex = 0.75,
                        plot = 1
                        )

legendsymbols <- pointsTile(x=  c(2,      2,       2),
                            y=  c(88,     82,      77),
                            pch = pch,
                            col = col,
                            size = 1,
                            cex = 0.75,
                            plot=1
                            )

legendlabels <- textTile(labels=c("Professional",
                                  "White collar",
                                  "Blue collar"),
                         x=  c(11,      11,       11),
                         y=  c(88,     82,      77),
                         pch= pch,
                         col= col,
                         cex = 0.75,
                         plot=1
                         )

# Create rug traces
xrug <- rugTile(x=education, type="dots", plot=1)

yrug <- rugTile(y=prestige, type="dots", plot=1)

# Plot all traces using tile
tile(prestigeXeducation,
     legendtitle,
     legendsymbols,
     legendlabels,
     xrug, yrug,

     width = list(null=5),      # widen plot area for visibility
     #output = list(file="ScatterplotExample"),
     limits = c(0,100,0,100),
     xaxistitle = list(labels=
                      "Education (% of incumbents who were high school graduates in 1950)"),
     yaxistitle = list(labels=
                      "Prestige (% rated good or excellent by survey)"),
     height=list(plot="golden")
     )

chrisadolph/tileForShiny documentation built on Feb. 6, 2022, 12:34 a.m.