Function to Compute and Display Summary Statistics

Description

Function to compute summary statistics for a 'one-page' report and display them in an inset. The function may be used stand-alone, and it also serves as the 'engine' for the gx.summary.* series of functions.

Usage

gx.stats(xx, xlab = deparse(substitute(xx)), display = TRUE,
	iftell = TRUE)

Arguments

xx

name of the variable to be processed.

xlab

by default the character string for xx is used for the table title. An alternate title can be displayed with xlab = "text string", see Examples.

display

if display = TRUE the summary statistics are displayed on the current device. If display = FALSE output is suppressed.

iftell

by default the NA count is displayed by remove.na prior to the table of results from this function. When the function is used as a 'stats' engine, the calling function may suppress this display with iftell = FALSE so that it can report the NA count itself.

Details

The summary statistics comprise the data minimum, maximum and percentile values; two robust estimates of standard deviation, the Median Absolute Deviation (MAD) and the Inter-Quartile Standard Deviation (IQSD); the mean, variance, standard deviation (SD) and coefficient of variation (CV%); and the 95% confidence bounds on the median. When the minimum data value is > 0, the mean, variance, SD and CV% are also computed after a log10 data transformation and returned to the calling function.
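The robust and classical spread estimates listed above can be compared with a short base R sketch. This is an illustration only, not the package's code: the data vector is made up, and the scaling constants (1.4826 for the MAD, the default in stats::mad, and 1.349 for the IQSD) are the usual normal-consistency values, assumed here rather than taken from the gx.stats source.

```r
## Hypothetical data with one high outlier; not from the package
x <- c(2.1, 3.4, 3.9, 4.4, 5.0, 5.8, 6.3, 7.7, 9.2, 41.0)

## MAD, scaled by the stats::mad default constant (1.4826) so that it
## estimates the SD for normally distributed data
mad.x <- mad(x)

## IQSD: interquartile range divided by 1.349 (assumed scaling)
iqsd.x <- IQR(x) / 1.349

## Classical estimates for comparison
mean.x <- mean(x)
sd.x <- sd(x)
cv.x <- 100 * sd.x / mean.x   # coefficient of variation as a percentage

round(c(MAD = mad.x, IQSD = iqsd.x, SD = sd.x, CV = cv.x), 2)
```

With an outlier present, the two robust estimates stay close to each other while the SD is inflated, which is the reason both kinds of estimate appear in the report.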

Value

stats

the computed summary statistics to be used in function inset, and by gx.summary.* functions. The list returned, stats, is a 32-element vector, see below:

[1:10]

the minimum value, and the 1st, 2nd, 5th, 10th, 20th, 25th (Q1), 30th, 40th and 50th (Q2) percentiles.

[11:19]

the 60th, 70th, 75th (Q3), 80th, 90th, 95th, 98th and 99th percentiles and the maximum value.

[20]

the sample size, N.

[21]

the Median Absolute Deviation (MAD).

[22]

the Inter-Quartile Standard Deviation (IQSD).

[23]

the data (sample) Mean.

[24]

the data (sample) Variance.

[25]

the data (sample) Standard Deviation (SD).

[26]

the Coefficient of Variation as a percentage (CV%).

[27]

the Lower 95% Confidence Limit on the Median.

[28]

the Upper 95% Confidence Limit on the Median.

[29]

the log10 transformed data (sample) Mean.

[30]

the log10 transformed data (sample) Variance.

[31]

the log10 transformed data (sample) SD.

[32]

the log10 transformed data (sample) CV%.

If the minimum data value is <= 0, then stats[29:32] <- NA.
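The layout described above can be mimicked in base R to see which positions hold which quantities. The function stats.layout.sketch below is hypothetical and is not part of the package; it fills only the percentile, sample size and log10 positions, and it assumes the default quantile definition of stats::quantile, which may differ from the one gx.stats uses.

```r
## Hypothetical helper illustrating the 32-element layout; not package code
stats.layout.sketch <- function(x) {
  x <- x[!is.na(x)]                      # NAs are dropped before computation
  probs <- c(0, 0.01, 0.02, 0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.50,
             0.60, 0.70, 0.75, 0.80, 0.90, 0.95, 0.98, 0.99, 1)
  s <- rep(NA_real_, 32)
  s[1:19] <- quantile(x, probs = probs, names = FALSE)  # min, percentiles, max
  s[20] <- length(x)                                    # sample size, N
  ## elements 21 to 28 are left as NA in this sketch
  if (min(x) > 0) {                      # log10 statistics need positive data
    lx <- log10(x)
    s[29] <- mean(lx)
    s[30] <- var(lx)
    s[31] <- sd(lx)
    s[32] <- 100 * sd(lx) / mean(lx)
  }
  s
}

s <- stats.layout.sketch(c(1, 10, 100))
s[c(1, 10, 19, 20)]   # minimum, median, maximum, N
s[29:32]              # log10 statistics, available because min(x) > 0
```

Calling the sketch on data containing a zero, for example stats.layout.sketch(c(0, 1, 2)), leaves elements 29 to 32 as NA, matching the rule stated above.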

Note

Any less-than-detection-limit values represented by negative values, and any zeros or other numeric codes representing blanks in the data, must be removed prior to executing this function, see ltdl.fix.df.

Any NAs in the data vector are removed prior to computation. The NA count is displayed when iftell = TRUE and suppressed when iftell = FALSE.

The confidence bounds on the median are estimated via the binomial theorem, not by normal approximation.
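One common way to obtain binomial-based confidence bounds on a median is to pick a symmetric pair of order statistics whose coverage, computed from the Binomial(n, 0.5) distribution, is at least the nominal level. The sketch below follows that approach; the function name median.ci.sketch is hypothetical, and the exact rule used inside gx.stats to choose the order statistics is not given here, so results may differ from the package's.

```r
## Hypothetical helper: distribution-free CI for the median from order statistics
median.ci.sketch <- function(x, conf = 0.95) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  alpha <- 1 - conf
  ## j is chosen so that P(Bin(n, 0.5) <= j - 1) does not exceed alpha / 2;
  ## for very small n, j is forced to 1 and coverage falls below nominal
  j <- max(1, qbinom(alpha / 2, n, 0.5))
  list(lower = x[j],
       upper = x[n - j + 1],
       coverage = 1 - 2 * pbinom(j - 1, n, 0.5))  # actual coverage achieved
}

ci <- median.ci.sketch(1:10)
ci$lower      # 2nd smallest value
ci$upper      # 9th smallest value
ci$coverage   # exact coverage of the interval
```

Because the bounds are actual data values, the interval needs no assumption about the shape of the distribution, and the reported coverage shows how far the discrete binomial steps push it above the nominal 95%.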

Author(s)

Robert G. Garrett

See Also

ltdl.fix.df, remove.na

Examples

## Make test data available
data(kola.o)
attach(kola.o)

## Generates an initial display
gx.stats(Cu)

## Provides a more appropriate labelled display
gx.stats(Cu, xlab = "Cu (mg/kg) in <2 mm O-horizon soil")

## Detach test data
detach(kola.o)
