A comprehensive R package for environmental statistics and the successor to the S-PLUS module EnvironmentalStats for S-PLUS (first released in April 1997). EnvStats provides a set of powerful functions for graphical and statistical analyses of environmental data, with a focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. It includes major environmental statistical methods found in the literature and regulatory guidance documents, and extensive help that explains what these methods do, how to use them, and where to find them in the literature. It also includes numerous built-in data sets from regulatory guidance documents and environmental statistics literature, and scripts reproducing analyses presented in the User's Manual: EnvStats: An R Package for Environmental Statistics (Millard, 2013, http://www.springer.com/book/9781461484554).
For a complete list of functions and datasets, you can do any of the following:
See the help file Functions By Category for a listing of functions by category.
If you are in the online help, scroll to the bottom of this help page and click on the Index link.
Type library(help="EnvStats") at the command prompt.
Note: The names of all EnvStats functions start with a lowercase letter, and the names of all EnvStats datasets and data objects start with an uppercase letter.
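The naming convention above makes it possible to separate functions from data objects programmatically. The following is a minimal sketch, assuming EnvStats is installed and that its datasets appear in the attached package environment; split_by_case is a hypothetical helper written for this illustration and is not part of EnvStats.

```r
# Hypothetical helper (not part of EnvStats): split object names by the
# case of their first character, following the naming convention above.
split_by_case <- function(object_names) {
  starts_lower <- grepl("^[a-z]", object_names)
  list(functions = object_names[starts_lower],
       datasets  = object_names[!starts_lower])
}

# Small illustration with names that appear on this help page
print(split_by_case(c("gofTest", "EPA.94b.tccb.df", "summaryFull")))

# With the package installed, apply it to everything EnvStats attaches
if (requireNamespace("EnvStats", quietly = TRUE)) {
  library(EnvStats)
  str(split_by_case(ls("package:EnvStats")))
}
```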
You can type newsEnvStats() at the R command prompt for the latest news about the EnvStats package.
Package:  EnvStats 
Type:  Package 
Version:  2.3.0 
Date:  2017-10-09 
License:  GPL (>=3) 
LazyLoad:  yes 
A companion file EnvStats-manual.pdf containing a listing of all the current help files is located on the R CRAN web site at https://cran.r-project.org/package=EnvStats/EnvStats.pdf and also in the doc subdirectory of the directory where the EnvStats package was installed. For example, if you installed R under Windows, this file might be located in the directory C:\Program Files\R-*.**.*\library\EnvStats\doc, where *.**.* denotes the version of R you are using (e.g., 3.3.4), or in the directory C:\Users\Name\Documents\R\win-library\*.**.*\EnvStats\doc, where Name denotes your user name on the Windows operating system.
EnvStats comes with companion scripts, located in the scripts subdirectory of the directory where the package was installed. One set of scripts lets you reproduce the examples in the User's Manual. There are also scripts that let you reproduce examples from US EPA guidance documents.
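To find these directories without guessing the installation path, base R's system.file() can be used. This is a minimal sketch; the subdirectory names doc and scripts are taken from the text above, and system.file() returns an empty string when the package or subdirectory is not found.

```r
# Locate the companion directories of an installed EnvStats package.
# system.file() returns "" if the package (or subdirectory) is absent.
doc_dir     <- system.file("doc", package = "EnvStats")
scripts_dir <- system.file("scripts", package = "EnvStats")

if (nzchar(scripts_dir)) {
  # List the companion scripts shipped with the package
  print(list.files(scripts_dir, recursive = TRUE))
} else {
  message("EnvStats scripts directory not found; is the package installed?")
}
```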
See the References section below for documentation for the predecessor to EnvStats, EnvironmentalStats for S-PLUS for Windows.
Features of EnvStats include:
New functions for computing summary statistics, as well as creating summary plots to compare the distributions of groups side-by-side, including functions specifically designed to work with plots created with ggplot (see Plotting Using ggplot2).
New probability distributions have been added to the ones already available in R, including the extreme value distribution and the zero-modified lognormal (delta) distribution. You can compute quantities associated with these probability distributions (probability density functions, cumulative distribution functions, and quantiles), and generate random numbers from these distributions.
Plot probability distributions so you can see how they change with the value of the distribution parameter(s).
Estimate distribution parameters and distribution quantiles, and compute confidence intervals for commonly used probability distributions, including special methods for the lognormal and gamma distributions.
Perform and plot the results of goodness-of-fit tests:
Observed and Fitted Distributions
Quantile-Quantile Plots
Results of Shapiro-Wilk test, Kolmogorov-Smirnov test, etc.
Includes a new generalized goodness-of-fit test for any continuous distribution. Also includes a new function to choose among several candidate distributions.
Functions for assessing optimal Box-Cox data transformations.
Compute parametric and nonparametric prediction intervals, simultaneous prediction intervals, and tolerance intervals.
New functions for hypothesis tests, including:
Nonparametric estimation and tests for seasonal trend
Fisher's one-sample randomization (permutation) test for location
Quantile test to detect a shift in the tail of one population relative to another
Two-sample linear rank tests
Test for serial correlation based on von Neumann rank test
Perform calibration based on a machine signal to determine decision and detection limits and report estimated concentrations along with confidence intervals.
Easily perform power and sample size computations and create companion plots for sampling designs based on confidence intervals, hypothesis tests, prediction intervals, and tolerance intervals.
Handle singly and multiply censored (less-than-detection-limit) data:
Empirical CDF and Quantile-Quantile Plots
Parameter/Quantile Estimation and Confidence Intervals
Prediction and Tolerance Intervals
Goodness-of-Fit Tests
Optimal Box-Cox Transformations
Two-Sample Rank Tests
Functions for performing Monte Carlo simulation and probabilistic risk assessment.
Reproduce specific examples in EPA guidance documents by using built-in data sets from these documents and running companion scripts.
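As one illustration of the distribution functions listed above, the zero-modified lognormal (delta) distribution mixes a point mass at zero with a lognormal distribution. The sketch below writes its cumulative distribution function in base R and, if EnvStats is installed, prints the package's own value for comparison. The name pzmlnorm_sketch is hypothetical and used only here, and the argument names in the EnvStats::pzmlnorm call are assumed to match the package's help file.

```r
# Hypothetical base-R sketch of the zero-modified lognormal CDF:
# P(X <= q) = p.zero + (1 - p.zero) * plnorm(q) for q >= 0, and 0 for q < 0.
pzmlnorm_sketch <- function(q, meanlog = 0, sdlog = 1, p.zero = 0.5) {
  ifelse(q < 0, 0, p.zero + (1 - p.zero) * plnorm(q, meanlog, sdlog))
}

q <- c(0, 0.5, 1, 5)
sketch_cdf <- pzmlnorm_sketch(q, meanlog = 0, sdlog = 1, p.zero = 0.3)
print(sketch_cdf)

# Optional comparison against the package implementation (assumed signature)
if (requireNamespace("EnvStats", quietly = TRUE)) {
  print(EnvStats::pzmlnorm(q, meanlog = 0, sdlog = 1, p.zero = 0.3))
}
```

The jump of size p.zero at q = 0 is what distinguishes this distribution from an ordinary lognormal, which places no probability at zero.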
Steven P. Millard
Maintainer: Steven P. Millard <[email protected]>
Millard, S.P. (2013). EnvStats: An R Package for Environmental Statistics. Springer, New York. http://www.springer.com/book/9781461484554.
Millard, S.P. (2002). EnvironmentalStats for S-PLUS: User's Manual for Version 2.0. Second Edition. Springer-Verlag, New York.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL.
# Look at plots and summary statistics for the TcCB data given in
# USEPA (1994b) (the data are stored in EPA.94b.tccb.df).
# Arbitrarily set the one censored observation to the censoring level.
# Group by the variable Area.
EPA.94b.tccb.df
# TcCB.orig TcCB Censored Area
#1 0.22 0.22 FALSE Reference
#2 0.23 0.23 FALSE Reference
#...
#46 1.20 1.20 FALSE Reference
#47 1.33 1.33 FALSE Reference
#48 <0.09 0.09 TRUE Cleanup
#49 0.09 0.09 FALSE Cleanup
#...
#123 51.97 51.97 FALSE Cleanup
#124 168.64 168.64 FALSE Cleanup
# First plot the data
#
dev.new()
stripChart(TcCB ~ Area, data = EPA.94b.tccb.df,
  xlab = "Area", ylab = "TcCB (ppb)")
mtext("TcCB Concentrations by Area", line = 3, cex = 1.25, font = 2)
dev.new()
stripChart(log10(TcCB) ~ Area, data = EPA.94b.tccb.df,
  p.value = TRUE,
  xlab = "Area", ylab = expression(paste(log[10], " [ TcCB (ppb) ]")))
mtext(expression(paste(log[10], "(TcCB) Concentrations by Area")),
  line = 3, cex = 1.25, font = 2)
#
# Now compute summary statistics
#
sum(EPA.94b.tccb.df$Censored)
#[1] 1
with(EPA.94b.tccb.df, TcCB[Censored])
#0.09
# Summary statistics will treat the one censored value
# as equal to the detection limit.
summaryFull(TcCB ~ Area, data = EPA.94b.tccb.df)
# Cleanup Reference
#N 77 47
#Mean 3.915 0.5985
#Median 0.43 0.54
#10% Trimmed Mean 0.6846 0.5728
#Geometric Mean 0.5784 0.5382
#Skew 7.717 0.9019
#Kurtosis 62.67 0.132
#Min 0.09 0.22
#Max 168.6 1.33
#Range 168.5 1.11
#1st Quartile 0.23 0.39
#3rd Quartile 1.1 0.75
#Standard Deviation 20.02 0.2836
#Geometric Standard Deviation 3.898 1.597
#Interquartile Range 0.87 0.36
#Median Absolute Deviation 0.3558 0.2669
#Coefficient of Variation 5.112 0.4739
summaryStats(TcCB ~ Area, data = EPA.94b.tccb.df, digits = 1)
# N Mean SD Median Min Max
#Cleanup 77 3.9 20.0 0.4 0.1 168.6
#Reference 47 0.6 0.3 0.5 0.2 1.3
#
# Compute Shapiro-Wilk Goodness-of-Fit statistic for the
# Reference Area TcCB data assuming a lognormal distribution
#
sw.list <- gofTest(TcCB ~ 1, data = EPA.94b.tccb.df,
  subset = Area == "Reference", dist = "lnorm")
sw.list
# Results of Goodness-of-Fit Test
# -------------------------------
#
# Test Method: Shapiro-Wilk GOF
#
# Hypothesized Distribution: Lognormal
#
# Estimated Parameter(s): meanlog = -0.6195712
#                         sdlog   =  0.4679530
#
# Estimation Method: mvue
#
# Data: TcCB
#
# Subset With: Area == "Reference"
#
# Data Source: EPA.94b.tccb.df
#
# Sample Size: 47
#
# Test Statistic: W = 0.978638
#
# Test Statistic Parameter: n = 47
#
# P-value: 0.5371935
#
# Alternative Hypothesis: True cdf does not equal the
# Lognormal Distribution.
#
# Plot results of GOF test
dev.new()
plot(sw.list)
#
# Based on the Reference Area data, estimate 90th percentile
# and compute a 95% confidence limit for the 90th percentile
# assuming a lognormal distribution.
#
with(EPA.94b.tccb.df,
  eqlnorm(TcCB[Area == "Reference"], p = 0.9, ci = TRUE))
# Results of Distribution Parameter Estimation
# --------------------------------------------
#
# Assumed Distribution: Lognormal
#
# Estimated Parameter(s): meanlog = -0.6195712
#                         sdlog   =  0.4679530
#
# Estimation Method: mvue
#
# Estimated Quantile(s): 90'th %ile = 0.9803307
#
# Quantile Estimation Method: qmle
#
# Data: TcCB[Area == "Reference"]
#
# Sample Size: 47
#
# Confidence Interval for: 90'th %ile
#
# Confidence Interval Method: Exact
#
# Confidence Interval Type: two-sided
#
# Confidence Level: 95%
#
# Confidence Interval: LCL = 0.8358791
#                      UCL = 1.2154977
#
# Cleanup
rm(sw.list)
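
The estimated 90th percentile reported above can be reproduced by hand from the printed parameter estimates. This is a minimal sketch in base R, assuming the quantile method labeled qmle plugs the estimates into the lognormal quantile function. Note that meanlog is negative here, because the Reference area geometric mean (0.5382) is below 1.

```r
# Parameter estimates copied from the printed output for the Reference area
meanlog <- -0.6195712
sdlog   <-  0.4679530

# Plug-in 90th percentile of a lognormal distribution:
# exp(meanlog + z_0.9 * sdlog), equivalent to qlnorm(0.9, meanlog, sdlog)
q90 <- exp(meanlog + qnorm(0.9) * sdlog)
print(q90)

# Sanity check: the geometric mean implied by meanlog
geo_mean <- exp(meanlog)
print(geo_mean)
```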
