gx.stats: Function to Compute and Display Summary Statistics

Description

Function to compute summary statistics for a 'one-page' report and for display by the function inset. The function may be used stand-alone, and also serves as the ‘engine’ for the gx.summary.* series of functions.

Usage

gx.stats(xx, xlab = deparse(substitute(xx)), display = TRUE,
	iftell = TRUE)

Arguments

xx

name of the variable to be processed.

xlab

by default the character string for xx is used for the table title. An alternate title can be displayed with xlab = "text string", see Examples.

display

if display = TRUE the summary statistics are displayed on the current device. If display = FALSE output is suppressed.

iftell

by default the NA count is displayed by remove.na prior to the table of results from this function. When the function is used as a ‘stats’ engine, the calling function may set iftell = FALSE to suppress this display when it reports the NA count itself.

Details

The summary statistics comprise the data minimum, maximum and percentile values; two robust estimates of the standard deviation, the Median Absolute Deviation (MAD) and the Inter-Quartile Standard Deviation (IQSD); the mean, variance, standard deviation (SD) and coefficient of variation (CV%); and the 95% confidence bounds on the median. When the minimum data value is > 0, the mean, variance, SD and CV% are also computed after a log10 data transformation and returned to the calling function.
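
As a rough illustration of the quantities listed above, the following base R sketch computes them for a small made-up vector. It is not the rgr source: the name summary_sketch, the use of quantile() with its default interpolation, mad() with its default scaling constant, and the IQSD taken as (Q3 - Q1)/1.349 are all assumptions. The last of these is at least consistent with the Cu example output in Examples, where (18.2 - 6.85)/1.349 is approximately 8.414.

```r
# Hypothetical helper, not part of rgr: base R versions of the statistics
# described in Details.
summary_sketch <- function(x) {
  x <- x[!is.na(x)]                       # NAs are dropped before computation
  probs <- c(0, 0.01, 0.02, 0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.50,
             0.60, 0.70, 0.75, 0.80, 0.90, 0.95, 0.98, 0.99, 1)
  q <- quantile(x, probs = probs, names = FALSE)
  out <- list(
    n      = length(x),
    pctl   = q,                           # minimum, percentiles, maximum
    mad    = mad(x),                      # default constant 1.4826 (assumption)
    iqsd   = (q[13] - q[7]) / 1.349,      # (Q3 - Q1) / 1.349 (assumption)
    mean   = mean(x),
    var    = var(x),
    sd     = sd(x),
    cv_pct = 100 * sd(x) / mean(x)
  )
  # log10 statistics are only computed when every value is positive
  if (min(x) > 0) {
    lx <- log10(x)
    out$log10 <- c(mean = mean(lx), var = var(lx), sd = sd(lx),
                   cv_pct = 100 * sd(lx) / mean(lx))
  } else {
    out$log10 <- c(mean = NA, var = NA, sd = NA, cv_pct = NA)
  }
  out
}

# Made-up data, including one NA
demo <- summary_sketch(c(2.5, 4, 5.5, 7, 9, 12, 18, 40, NA))
```

The sketch returns a named list for readability; gx.stats itself returns the positional vector described under Value.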

Value

stats

the computed summary statistics, for use by the function inset and by the gx.summary.* functions. stats is returned as a 32-element vector with the following layout:

[1:10]

the minimum value, and the 1st, 2nd, 5th, 10th, 20th, 25th (Q1), 30th, 40th and 50th (Q2) percentiles.

[11:19]

the 60th, 70th, 75th (Q3), 80th, 90th, 95th, 98th and 99th percentiles and the maximum value.

[20]

the sample size, N.

[21]

the Median Absolute Deviation (MAD).

[22]

the Inter-Quartile Standard Deviation (IQSD).

[23]

the data (sample) Mean.

[24]

the data (sample) Variance.

[25]

the data (sample) Standard Deviation (SD).

[26]

the Coefficient of Variation as a percentage (CV%).

[27]

the Lower 95% Confidence Limit on the Median.

[28]

the Upper 95% Confidence Limit on the Median.

[29]

the log10 transformed data (sample) Mean.

[30]

the log10 transformed data (sample) Variance.

[31]

the log10 transformed data (sample) SD.

[32]

the log10 transformed data (sample) CV%.

If the minimum data value is <= 0, then stats[29:32] <- NA.
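
Because the vector is positional, it can be convenient to attach names before picking out individual statistics. The sketch below does this for any 32-element vector laid out as above. The labels and the helper label_stats are hypothetical and not defined by rgr, and the input is a placeholder rather than real gx.stats output.

```r
# Hypothetical labels (not defined by rgr) for the 32 positions listed above.
gx_stats_labels <- c(
  "min", paste0("p", c(1, 2, 5, 10, 20, 25, 30, 40, 50)),    # [1:10]
  paste0("p", c(60, 70, 75, 80, 90, 95, 98, 99)), "max",     # [11:19]
  "n", "mad", "iqsd", "mean", "var", "sd", "cv_pct",         # [20:26]
  "median_lcl95", "median_ucl95",                            # [27:28]
  "log10_mean", "log10_var", "log10_sd", "log10_cv_pct"      # [29:32]
)

# Attach the labels to a 32-element vector laid out as in the Value section.
label_stats <- function(stats) {
  stopifnot(length(stats) == length(gx_stats_labels))
  names(stats) <- gx_stats_labels
  stats
}

# Placeholder input only: each element holds its own position number.
labelled <- label_stats(seq_len(32))
labelled[c("p50", "n", "median_lcl95")]
```

With the placeholder input, the selected elements are simply their positions (10, 20 and 27), which confirms that the labels line up with the index ranges given above.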

Note

Any less-than-detection-limit values represented by negative values, and any zeros or other numeric codes representing blanks in the data, must be removed prior to executing this function; see ltdl.fix.df.

Any NAs in the data vector are removed prior to computation. The NA count is displayed when iftell = TRUE and suppressed when iftell = FALSE.
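
The NA handling described here can be pictured with a few lines of base R. This is a stand-in only: drop_na_sketch and the wording of its message are hypothetical, and the actual work in rgr is done by remove.na.

```r
# Hypothetical stand-in for the NA handling described above (not rgr code).
drop_na_sketch <- function(x, iftell = TRUE) {
  n_na <- sum(is.na(x))                   # count the missing values
  if (iftell) cat(" ", n_na, "NA(s) removed from vector\n")
  list(x = x[!is.na(x)], nna = n_na)      # cleaned data plus the NA count
}

# With iftell = FALSE nothing is printed; the count is still returned.
res <- drop_na_sketch(c(3.1, NA, 4.7, NA, 9.2), iftell = FALSE)
```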

The confidence bounds on the median are estimated via the binomial theorem, not by normal approximation.
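
One common way to obtain such distribution-free bounds is from order statistics, using the fact that the number of observations below the true median follows a Binomial(n, 0.5) distribution. The sketch below shows that construction. It is an assumption about the general approach, not the rgr implementation, and the helper median_ci_sketch and its rank selection rule are hypothetical.

```r
# Hypothetical helper, not the rgr implementation: a distribution-free
# confidence interval for the median built from order statistics.
median_ci_sketch <- function(x, conf = 0.95) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  alpha <- 1 - conf
  # The count of observations below the median follows Binomial(n, 0.5);
  # j is the smallest rank whose lower-tail probability reaches alpha / 2.
  j <- qbinom(alpha / 2, size = n, prob = 0.5)
  if (j < 1) stop("too few observations for the requested confidence level")
  c(lower = x[j], upper = x[n - j + 1])
}

# Made-up data: the integers 1 to 11
ci <- median_ci_sketch(1:11)
```

For the eleven values above the selected ranks are 2 and 10, so the interval runs from 2 to 10. Because ranks are discrete, the actual coverage of this construction is at least the nominal level rather than exactly equal to it.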

Author(s)

Robert G. Garrett

See Also

ltdl.fix.df, remove.na

Examples

## Make test data available
data(kola.o)
attach(kola.o)

## Generates an initial display
gx.stats(Cu)

## Provides a more appropriate labelled display
gx.stats(Cu, xlab = "Cu (mg/kg) in <2 mm O-horizon soil")

## Detach test data
detach(kola.o)

Example output

Loading required package: MASS
Loading required package: fastICA

 Summary Statistics Display for: Cu 

 Data Set N =   617 
 Minimum =      2.69 		Maximum = 4080 
 Median =       9.69 		MAD Est = 5.145 
				IQR Est = 8.414 
 95% CI for the Median =	   9.07 to 10.4 

 Mean =         43.69 		S.D. =    245.5 
 Variance =     60260 		C.V. % =  561.85 

 Maximum Value          4080 
 99th Percentile        435.28 
 98th Percentile        241.08 
 95th Percentile        98.18 
 90th Percentile        47.38 
 80th Percentile        21.68 
 3rd Quartile (75th)    18.2 
 70th Percentile        16 
 60th Percentile        11.56 
 Median (50th)          9.69 
 40th Percentile        8.344 
 30th Percentile        7.372 
 1st Quartile (25th)    6.85 
 20th Percentile        6.552 
 10th Percentile        5.696 
  5th Percentile        5.06 
  2nd Percentile        4.7032 
  1st Percentile        4.4664 
 Minimum Value          2.69 


 Summary Statistics Display for: Cu (mg/kg) in <2 mm O-horizon soil 

 Data Set N =   617 
 Minimum =      2.69 		Maximum = 4080 
 Median =       9.69 		MAD Est = 5.145 
				IQR Est = 8.414 
 95% CI for the Median =	   9.07 to 10.4 

 Mean =         43.69 		S.D. =    245.5 
 Variance =     60260 		C.V. % =  561.85 

 Maximum Value          4080 
 99th Percentile        435.28 
 98th Percentile        241.08 
 95th Percentile        98.18 
 90th Percentile        47.38 
 80th Percentile        21.68 
 3rd Quartile (75th)    18.2 
 70th Percentile        16 
 60th Percentile        11.56 
 Median (50th)          9.69 
 40th Percentile        8.344 
 30th Percentile        7.372 
 1st Quartile (25th)    6.85 
 20th Percentile        6.552 
 10th Percentile        5.696 
  5th Percentile        5.06 
  2nd Percentile        4.7032 
  1st Percentile        4.4664 
 Minimum Value          2.69 

rgr documentation built on May 2, 2019, 6:09 a.m.
