aggregate.plot: Plot summary statistics of a numeric variable by group

View source: R/epiDisplay.R

Description

Split a numeric variable into subsets and plot summary statistics for each subset.

Usage

## S3 method for class 'plot'
aggregate(x, by, grouping = NULL, FUN = c("mean", "median"), 
    error = c("se", "ci", "sd", "none"), alpha = 0.05, lwd = 1, 
    lty = "auto", line.col = "auto", bin.time = 4, bin.method = c("fixed", 
        "quantile"), legend = "auto", legend.site = "topright", 
    legend.bg = "white", xlim = "auto", ylim = "auto", bar.col = "auto", 
    cap.size = 0.02, lagging = 0.007, main = "auto", return.output = FALSE, ...) 

Arguments

x

a numeric variable

by

a list of grouping elements for the bar plot, or a single numeric or integer variable which will form the X axis for the time line graph

grouping

further stratification variable for the time line graph

FUN

either "mean" or "median"

error

statistic to use for error lines (either 'se' or 'sd' for barplot, or 'ci' or 'none' for time line graph). When FUN = "median", can only be 'IQR' (default) or 'none'.

alpha

level of significance for confidence intervals

lwd

relative width of the "time" lines. See 'lwd' in ?par

lty

type of the "time" lines. See 'lty' in ?par

line.col

colour(s) of the error and time lines

bin.time

number of bins in the time line graph

bin.method

method to allocate the "time" variable into bins, either with 'fixed' interval or equally distributed sample sizes based on quantiles

legend

presence of automatic legend for the time line graph

legend.site

a single character string indicating location of the legend. See details of ?legend

legend.bg

background colour of the legend

xlim

X axis limits

ylim

Y axis limits

bar.col

bar colours

cap.size

relative length of terminating cross-line compared to the range of X axis

lagging

lagging (horizontal offset) between the error bars of two adjacent categories at the same time point. The value is the offset distance divided by the range of the X axis

main

main title of the graph

return.output

whether the data frame resulting from the aggregation should be returned

...

additional graphic parameters passed on to other methods

Details

This function plots aggregated values of 'x' by a factor (barplot) or a continuous variable (time line graph).

When 'by' is of class 'factor', a bar plot with error bars is displayed.

When 'by' is a continuous variable (typically implying time), a time line graph is displayed.

Both types of plot accept the 'error' argument. Choices are 'se' and 'sd' for the bar plot, and 'ci' and 'IQR' for both the bar plot and the time line graph. Error lines can be suppressed by specifying 'error = "none"'.
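The error statistics named above can be computed by hand in base R. The following is a minimal sketch, not the epiDisplay implementation: the data, the variable names, and the use of a t-based interval for 'ci' are all assumptions made for illustration.

```r
# Illustrative only: per-group mean, SD, SE and a confidence interval in base R.
# The data and all names here are made up; the t-based interval is an
# assumption, not necessarily what aggregate.plot computes.
set.seed(1)
x <- c(rnorm(20, mean = 5), rnorm(20, mean = 7))  # numeric variable
g <- factor(rep(c("A", "B"), each = 20))          # grouping factor
alpha <- 0.05                                     # significance level

n    <- tapply(x, g, function(v) sum(!is.na(v)))  # group sizes
m    <- tapply(x, g, mean, na.rm = TRUE)          # group means
s    <- tapply(x, g, sd, na.rm = TRUE)            # 'sd'
se   <- s / sqrt(n)                               # 'se'
half <- qt(1 - alpha / 2, df = n - 1) * se        # half-width of the 'ci'

summary_tab <- data.frame(
  group    = names(m),
  n        = as.vector(n),
  mean     = as.vector(m),
  sd       = as.vector(s),
  se       = as.vector(se),
  ci.lower = as.vector(m - half),
  ci.upper = as.vector(m + half)
)
print(summary_tab)

# Median with quartiles, the pairing described for FUN = "median"
iqr_tab <- do.call(rbind, tapply(x, g, quantile,
                                 probs = c(0.25, 0.5, 0.75), na.rm = TRUE))
print(iqr_tab)
```

A smaller 'alpha' widens the interval, since the quantile 'qt(1 - alpha / 2, df)' grows as 'alpha' shrinks.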

'bin.time' and 'bin.method' are used only when 'by' is a continuous variable without regular values (minimum frequency of 'by' < 3). 'aggregate.plot' detects this condition automatically and silently; 'bin.method' then chooses the method of aggregation and 'bin.time' determines the number of bins.
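The difference between the two binning methods can be seen with base R's 'cut' and 'quantile'. This is a rough sketch under assumed data and names; it is not taken from the epiDisplay source, and the package may choose its break points differently.

```r
# Illustrative only: two ways to bin a continuous 'by' variable.
# Data and object names are made up for this sketch.
set.seed(2)
by   <- rexp(200, rate = 0.1)  # skewed, irregular "time" values
bins <- 4                      # plays the role of bin.time

# Irregular values: the least frequent value of 'by' occurs fewer than 3 times
irregular <- min(table(by)) < 3

# "fixed": equal-width intervals across the range of 'by'
fixed_bin <- cut(by, breaks = bins, include.lowest = TRUE)

# "quantile": break points at quantiles, so bins hold similar sample sizes
qbreaks   <- quantile(by, probs = seq(0, 1, length.out = bins + 1),
                      na.rm = TRUE)
quant_bin <- cut(by, breaks = qbreaks, include.lowest = TRUE)

print(table(fixed_bin))  # counts per equal-width bin
print(table(quant_bin))  # counts per quantile bin
```

With skewed data, fixed-width bins can leave the upper bins sparsely populated, whereas quantile bins keep the group sizes balanced at the cost of unequal interval widths.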

If 'legend = TRUE' (by default), a legend box is automatically drawn in the "topright" corner of the graph. The location is set by 'legend.site' and can be changed to others such as "topleft", "center", etc. (see examples).

'cap.size' can be assigned to zero to remove the error bar cap.

Author(s)

Virasakdi Chongsuvivatwong <[email protected]>

See Also

'aggregate.data.frame', 'aggregate.numeric', 'tapply'

Examples

data(Compaq)
.data <- Compaq
attach(.data)
aggregate.plot(x=year, by=list(HOSPITAL = hospital, STAGE = stage), 
   return.output = TRUE)
# moving legend and changing bar colours
aggregate.plot(x=year, by=list(HOSPITAL = hospital, STAGE = stage), error="ci",
  legend.site = "topleft", bar.col = c("red","blue"))
detach(.data)

# Example with regular time intervals (all frequencies > 3)
data(Sitka, package="MASS")
.data <- Sitka
attach(.data)
tab1(Time, graph=FALSE) # all frequencies > 3
aggregate.plot(x=size, by=Time, cap.size = 0) # Note no cap on error bars
# For black and white presentation
aggregate.plot(x=size, by=Time, grouping=treat, FUN="median", 
  line.col=3:4, lwd =2)
detach(.data)

# Example with irregular time intervals (some frequencies < 3)
data(BP)
.data <- BP
attach(.data) 
des(.data)
age <- as.numeric(as.Date("2008-01-01") - birthdate)/365.25
aggregate.plot(x=sbp, by=age, grouping=saltadd, bin.method="quantile")
aggregate.plot(x=sbp, by=age, grouping=saltadd, lwd=3, 
  line.col=c("blue","green") , main = NULL)
title(main="Effect of age and salt adding on SBP", xlab="years",ylab="mm.Hg")
points(age[saltadd=="no"], sbp[saltadd=="no"], col="blue")
points(age[saltadd=="yes"], sbp[saltadd=="yes"], pch=18, col="green")
detach(.data)
rm(list=ls())

## For a binary outcome variable, aggregated probabilities are computed
data(Outbreak)
.data <- Outbreak
attach(.data)
.data$age[.data$age == 99] <- NA
detach(.data)
attach(.data)
aggregate.plot(diarrhea, by=age, bin.time=5)
diarrhea1 <- factor(diarrhea)
levels(diarrhea1) <- c("no","yes")
aggregate.plot(diarrhea1, by=age, bin.time=5)
detach(.data)
rm(list=ls())

Example output

Loading required package: foreign
Loading required package: survival
Loading required package: MASS
Loading required package: nnet
          HOSPITAL   STAGE     mean        se
1  Public hospital Stage 1 5.340593 0.1285291
2 Private hospital Stage 1 5.689674 0.1617818
3  Public hospital Stage 2 4.483727 0.1223000
4 Private hospital Stage 2 5.153961 0.2727543
5  Public hospital Stage 3 1.619702 0.1938394
6 Private hospital Stage 3 2.011316 0.6975537
7  Public hospital Stage 4 2.736775 0.5203789
8 Private hospital Stage 4 5.823433 0.6592239
Time : 
        Frequency Percent Cum. percent
152            79      20           20
174            79      20           40
201            79      20           60
227            79      20           80
258            79      20          100
  Total       395     100          100
cross-sectional survey on BP & risk factors 
 No. of observations =  100 
  Variable      Class           Description        
1 id            integer         id                 
2 sex           factor          sex                
3 sbp           integer         Systolic BP        
4 dbp           integer         Diastolic BP       
5 saltadd       factor          Salt added on table
6 birthdate     Date                               

epiDisplay documentation built on May 11, 2018, 1:04 a.m.