# smean.sd: Compute Summary Statistics on a Vector (Hmisc: Harrell Miscellaneous)

## Description

A number of statistical summary functions are provided for use with `summary.formula` and `summarize` (as well as with `tapply`, or on their own). `smean.cl.normal` computes 3 summary variables: the sample mean and the lower and upper Gaussian confidence limits based on the t-distribution. `smean.sd` computes the mean and standard deviation. `smean.sdl` computes the mean plus or minus a constant times the standard deviation. `smean.cl.boot` is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality. `smedian.hilow` computes the sample median and a selected pair of outer quantiles having equal tail areas. All of these functions delete NAs automatically.
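The t-based limits and the mean plus or minus a multiple of the standard deviation can be sketched in base R. The helper names `mean_cl_t` and `mean_sdl` are hypothetical and are not the Hmisc source. They only illustrate the quantities described above, assuming the default multiplier `qt((1+conf.int)/2, n-1)` is applied to the standard error of the mean.

```r
# Hypothetical helpers (not the Hmisc source): base-R sketches of two of the
# summaries described above.

# Mean with t-based confidence limits:
# mean +/- qt((1 + conf.int)/2, n - 1) * standard error of the mean
mean_cl_t <- function(x, conf.int = 0.95, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  xbar <- mean(x)
  se <- sd(x) / sqrt(n)
  mult <- qt((1 + conf.int) / 2, n - 1)
  c(Mean = xbar, Lower = xbar - mult * se, Upper = xbar + mult * se)
}

# Mean plus or minus a constant times the standard deviation
mean_sdl <- function(x, mult = 2, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  xbar <- mean(x)
  s <- sd(x)
  c(Mean = xbar, Lower = xbar - mult * s, Upper = xbar + mult * s)
}

x <- c(1, 2, 3, 4, NA)
mean_cl_t(x)  # the NA is dropped before computing
mean_sdl(x)
```

For complete data, the limits from `mean_cl_t` should agree with `t.test(x)$conf.int`, which gives a quick way to check the sketch.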

## Usage

```r
smean.cl.normal(x, mult=qt((1+conf.int)/2,n-1), conf.int=.95, na.rm=TRUE)

smean.sd(x, na.rm=TRUE)

smean.sdl(x, mult=2, na.rm=TRUE)

smean.cl.boot(x, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

smedian.hilow(x, conf.int=.95, na.rm=TRUE)
```

## Arguments

- `x`: for summary functions `smean.*` and `smedian.hilow`, a numeric vector from which NAs will be removed automatically
- `na.rm`: defaults to `TRUE`, unlike built-in functions, so that by default `NA`s are automatically removed
- `mult`: for `smean.cl.normal`, the multiplier of the standard error of the mean to use in obtaining confidence limits of the population mean (the default is the appropriate quantile of the t distribution). For `smean.sdl`, `mult` is the multiplier of the standard deviation used in obtaining a coverage interval about the sample mean. The default is `mult=2`, to use plus or minus 2 standard deviations.
- `conf.int`: for `smean.cl.normal` and `smean.cl.boot`, specifies the confidence level (0-1) for interval estimation of the population mean. For `smedian.hilow`, `conf.int` is the coverage probability the outer quantiles should target. When the default, 0.95, is used, the lower and upper quantiles computed are 0.025 and 0.975.
- `B`: number of bootstrap resamples for `smean.cl.boot`
- `reps`: set to `TRUE` to have `smean.cl.boot` return the vector of bootstrapped means as the `reps` attribute of the returned object
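To make the roles of `conf.int`, `B`, and `reps` concrete, here is a minimal base-R sketch. The function names `boot_mean_cl` and `median_hilow` are hypothetical, and the bootstrap shown is a plain percentile interval over resampled means, which is an assumption about the general approach rather than a copy of the Hmisc code.

```r
# Hypothetical helpers (not the Hmisc source), for illustration only.

# Percentile bootstrap limits for the mean: resample x with replacement B times,
# take the mean of each resample, then read off the tail quantiles.
boot_mean_cl <- function(x, conf.int = 0.95, B = 1000, na.rm = TRUE, reps = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  means <- replicate(B, mean(x[sample.int(n, n, replace = TRUE)]))
  lims <- quantile(means, c((1 - conf.int) / 2, (1 + conf.int) / 2), names = FALSE)
  res <- c(Mean = mean(x), Lower = lims[1], Upper = lims[2])
  if (reps) attr(res, "reps") <- means  # keep the resampled means if requested
  res
}

# Median with outer quantiles having equal tail areas (1 - conf.int)/2
median_hilow <- function(x, conf.int = 0.95, na.rm = TRUE) {
  q <- quantile(x, c(0.5, (1 - conf.int) / 2, (1 + conf.int) / 2),
                na.rm = na.rm, names = FALSE)
  c(Median = q[1], Lower = q[2], Upper = q[3])
}

set.seed(1)
x <- rnorm(50)
boot_mean_cl(x, B = 500)
median_hilow(x, conf.int = 0.5)  # median with the 25th and 75th percentiles
```

Keeping the resampled means as an attribute, as `reps=TRUE` does, lets a caller combine the bootstrap distributions of two groups, for example to form an interval for a difference in means.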

## Value

a vector of summary statistics
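Because each function returns a vector, applying one by group yields one vector per group, which can be bound into a table. The sketch below uses a hypothetical base-R stand-in, `mean_sd`, so that it runs without Hmisc. It illustrates the calling pattern only and makes no claim about the names Hmisc gives its results.

```r
# Hypothetical stand-in for a vector-valued summary function (not the Hmisc source)
mean_sd <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  c(Mean = mean(x), SD = sd(x))
}

y <- c(1, 2, 3, 10, 20, NA)
g <- c("a", "a", "a", "b", "b", "b")

# One summary vector per group, collected into a matrix with one row per group
tab <- t(sapply(split(y, g), mean_sd))
tab
```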

## Author(s)

Frank Harrell
Department of Biostatistics
Vanderbilt University

## See Also

`summarize`, `summary.formula`

## Examples

```r
set.seed(1)
x <- rnorm(100)
smean.sd(x)
smean.sdl(x)
smean.cl.normal(x)
smean.cl.boot(x)
smedian.hilow(x, conf.int=.5)  # 25th and 75th percentiles

# Function to compute 0.95 confidence interval for the difference in two means
# g is grouping variable
bootdif <- function(y, g) {
  g <- as.factor(g)
  a <- attr(smean.cl.boot(y[g==levels(g)[1]], B=2000, reps=TRUE),'reps')
  b <- attr(smean.cl.boot(y[g==levels(g)[2]], B=2000, reps=TRUE),'reps')
  meandif <- diff(tapply(y, g, mean, na.rm=TRUE))
  a.b <- quantile(b-a, c(.025,.975))
  res <- c(meandif, a.b)
  names(res) <- c('Mean Difference','.025','.975')
  res
}
```