summaryP | R Documentation |
summaryP produces a tall and thin data frame containing numerators
(freq) and denominators (denom) after stratifying the data by a series
of variables. A special capability to group a series of related yes/no
variables is included through the use of the ynbind function, for
which the user specifies a final argument label used to label the
panel created for that group of related variables.
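
To make the tall-and-thin layout concrete, here is a minimal base-R sketch that builds a similar table of numerators and denominators by hand. It is not how summaryP is implemented: the toy data frame, the object names tab and tall, and the column order are assumptions made for illustration, and missing values are not handled.

```r
# Hypothetical toy data; variable names are illustrative only
d <- data.frame(
  sex   = c("Female", "Male", "Male", "Female", "Male", "Female"),
  treat = c("A", "A", "B", "B", "A", "B")
)

# Numerator: count of each sex level within each treat stratum
tab <- as.data.frame(table(val = d$sex, treat = d$treat),
                     responseName = "freq", stringsAsFactors = FALSE)

# Denominator: number of observations in the stratum
tab$denom <- as.vector(table(d$treat)[tab$treat])

# Name of the variable whose levels are being summarized
tab$var <- "sex"

# One row per stratum x variable x level
tall <- tab[, c("treat", "var", "val", "freq", "denom")]
print(tall)
```

Each row carries its own denominator, so a proportion can be recovered as freq / denom for any row.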
If options(grType='plotly') is not in effect, the plot method for
summaryP displays proportions as a multi-panel dot chart using the
lattice package's dotplot function with a special panel function.
Numerators and denominators of proportions are also included as text,
in the same colors as used by an optional groups variable. The formula
argument used in the dotplot call is constructed, but the user can
easily reorder the variables by specifying formula, with elements
named val (category levels), var (classification variable name), freq
(calculated result) plus the overall cross-classification variables
excluding groups.
If options(grType='plotly') is in effect, the plot method makes an
entirely different display using Hmisc::dotchartpl with plotly if
marginVal is specified, whereby a stratification variable causes more
finely stratified estimates to be shown slightly below the lines, with
smaller and translucent symbols if data has been run through
addMarginal. The marginal summaries are shown as the main estimates
and the user can turn off display of the stratified estimates, or view
their details with hover text.
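
The idea of a marginal summary shown alongside stratified estimates can be sketched in base R as follows. This is a simplified stand-in for what addMarginal provides, not its implementation; the toy data, the label "All", and the object names marg, both and prop_male are assumptions for illustration.

```r
# Hypothetical toy data
d <- data.frame(
  region = c("Europe", "Europe", "North America", "North America"),
  sex    = c("Female", "Male", "Male", "Male")
)

# Stack a copy of the data in which the stratification variable is
# replaced by a single marginal label
marg <- d
marg$region <- "All"
both <- rbind(d, marg)

# Proportion male within each region and for all regions combined
prop_male <- tapply(both$sex == "Male", both$region, mean)
print(prop_male)
```

The "All" stratum plays the role of the marginal estimate, while the remaining strata are the more finely stratified estimates.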
The ggplot method for summaryP does not draw numerators and
denominators, but the chart is more compact than the lattice chart
drawn by the plot method because ggplot2 does not repeat category
names the same way as lattice does. Variable names that are too long
to fit in panel strips are renamed (1), (2), etc. and an attribute
"fnvar" is added to the result; this attribute is a character string
defining the abbreviations, useful in a figure caption. The ggplot2
object has labels for points plotted, used by plotly::ggplotly as
hover text (see example).
The latex method produces one or more LaTeX tabulars containing a
table representation of the result, with optional side-by-side display
if groups is specified. Multiple tabulars result from the presence of
non-group stratification factors.
summaryP(formula, data = NULL, subset = NULL,
         na.action = na.retain, sort=TRUE,
         asna = c("unknown", "unspecified"), ...)
## S3 method for class 'summaryP'
plot(x, formula=NULL, groups=NULL,
     marginVal=NULL, marginLabel=marginVal,
     refgroup=NULL, exclude1=TRUE, xlim = c(-.05, 1.05),
     text.at=NULL, cex.values = 0.5,
     key = list(columns = length(groupslevels), x = 0.75,
                y = -0.04, cex = 0.9,
                col = lattice::trellis.par.get('superpose.symbol')$col,
                corner=c(0,1)),
     outerlabels=TRUE, autoarrange=TRUE,
     col=colorspace::rainbow_hcl, ...)
## S3 method for class 'summaryP'
ggplot(data, mapping, groups=NULL, exclude1=TRUE,
       xlim=c(0, 1), col=NULL, shape=NULL, size=function(n) n ^ (1/4),
       sizerange=NULL, abblen=5, autoarrange=TRUE, addlayer=NULL,
       ..., environment)
## S3 method for class 'summaryP'
latex(object, groups=NULL, exclude1=TRUE, file='', round=3,
      size=NULL, append=TRUE, ...)
formula | a formula with the variables for whose levels proportions are computed on the left hand side, and major classification variables on the right. The formula needs to include any variable later used as |
data | an optional data frame. For |
subset | an optional subsetting expression or vector |
na.action | function specifying how to handle |
sort | set to |
asna | character vector specifying level names to consider the same as |
x | an object produced by |
groups | a character string containing the name of a superpositioning variable for obtaining further stratification within a horizontal line in the dot chart. |
marginVal | if |
marginLabel | specifies a different character string to use than the value of |
refgroup | used when doing a |
exclude1 | By default, |
xlim | |
text.at | specify to leave unused space to the right of each panel to prevent numerators and denominators from touching data points. |
cex.values | character size to use for plotting numerators and denominators |
key | a list to pass to the |
outerlabels | by default if there are two conditioning variables besides |
autoarrange | If |
col | a vector of colors to use to override defaults in |
shape | a vector of plotting symbols to override |
mapping, environment | not used; needed because of rules for generics |
size | for |
sizerange | a 2-vector specifying the |
abblen | labels of variables having only one level and having their name longer than |
... | used only for |
object | an object produced by |
file | file name, defaults to writing to console |
round | number of digits to the right of the decimal place for proportions |
append | set to |
addlayer | a |
summaryP produces a data frame of class "summaryP". The plot method
produces a lattice object of class "trellis". The latex method
produces an object of class "latex" with an additional attribute
ngrouplevels specifying the number of levels of any groups variable
and an attribute nstrata specifying the number of strata.
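
A quick way to check what summaryP returns is to inspect its class and column names. The sketch below assumes the Hmisc package is installed and skips itself otherwise; the toy data frame and the seed are hypothetical, and the columns you see may include more than the ones named in this help page.

```r
# Runs only if Hmisc is available; toy data are hypothetical
if (requireNamespace("Hmisc", quietly = TRUE)) {
  set.seed(2)
  toy <- data.frame(
    sex   = factor(sample(c("Female", "Male"), 40, TRUE)),
    treat = factor(sample(c("A", "B"), 40, TRUE))
  )
  s <- Hmisc::summaryP(sex ~ treat, data = toy)

  # The result is a data frame with an additional class
  print(class(s))

  # Column names, expected to include var, val, freq and denom
  print(names(s))
}
```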
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
bpplotM, summaryM, ynbind, pBlock, ggplot, colorFacet
n <- 100
f <- function(na=FALSE) {
  x <- sample(c('N', 'Y'), n, TRUE)
  if(na) x[runif(n) < .1] <- NA
  x
}
set.seed(1)
d <- data.frame(x1=f(), x2=f(), x3=f(), x4=f(), x5=f(), x6=f(), x7=f(TRUE),
                age=rnorm(n, 50, 10),
                race=sample(c('Asian', 'Black/AA', 'White'), n, TRUE),
                sex=sample(c('Female', 'Male'), n, TRUE),
                treat=sample(c('A', 'B'), n, TRUE),
                region=sample(c('North America','Europe'), n, TRUE))
d <- upData(d, labels=c(x1='MI', x2='Stroke', x3='AKI', x4='Migraines',
                        x5='Pregnant', x6='Other event', x7='MD withdrawal',
                        race='Race', sex='Sex'))
dasna <- subset(d, region=='North America')
with(dasna, table(race, treat))
s <- summaryP(race + sex + ynbind(x1, x2, x3, x4, x5, x6, x7, label='Exclusions') ~
              region + treat, data=d)
# add exclude1=FALSE below to include female category
plot(s, groups='treat')
require(ggplot2)
ggplot(s, groups='treat')
plot(s, val ~ freq | region * var, groups='treat', outerlabels=FALSE)
# Much better looking if omit outerlabels=FALSE; see output at
# https://hbiostat.org/R/Hmisc/summaryFuns.pdf
# See more examples under bpplotM
## For plotly interactive graphic that does not handle variable size
## panels well:
## require(plotly)
## g <- ggplot(s, groups='treat')
## ggplotly(g, tooltip='text')
## For nice plotly interactive graphic:
## options(grType='plotly')
## s <- summaryP(race + sex + ynbind(x1, x2, x3, x4, x5, x6, x7,
##               label='Exclusions') ~
##               treat, data=subset(d, region=='Europe'))
##
## plot(s, groups='treat', refgroup='A') # refgroup='A' does B-A differences
# Make a chart where there is a block of variables that
# are only analyzed for males. Keep redundant sex in block for demo.
# Leave extra space for numerators, denominators
sb <- summaryP(race + sex +
               pBlock(race, sex, label='Race: Males', subset=sex=='Male') ~
               region, data=d)
plot(sb, text.at=1.3)
plot(sb, groups='region', layout=c(1,3), key=list(space='top'),
     text.at=1.15)
ggplot(sb, groups='region')
## Not run:
plot(s, groups='treat')
# plot(s, groups='treat', outerlabels=FALSE) for standard lattice output
plot(s, groups='region', key=list(columns=2, space='bottom'))
require(ggplot2)
colorFacet(ggplot(s))
plot(summaryP(race + sex ~ region, data=d), exclude1=FALSE, col='green')
require(lattice)
# Make your own plot using data frame created by summaryP
useOuterStrips(dotplot(val ~ freq | region * var, groups=treat, data=s,
        xlim=c(0,1), scales=list(y='free', rot=0), xlab='Fraction',
        panel=function(x, y, subscripts, ...) {
          denom <- s$denom[subscripts]
          x <- x / denom
          panel.dotplot(x=x, y=y, subscripts=subscripts, ...) }))
# Show marginal summary for all regions combined
s <- summaryP(race + sex ~ region, data=addMarginal(d, region))
plot(s, groups='region', key=list(space='top'), layout=c(1,2))
# Show marginal summaries for both race and sex
s <- summaryP(ynbind(x1, x2, x3, x4, label='Exclusions', sort=FALSE) ~
              race + sex, data=addMarginal(d, race, sex))
plot(s, val ~ freq | sex*race)
## End(Not run)