RunAnalysis: Run Trend Analysis

View source: R/RunAnalysis.R

Description

This function analyses observations for a significant trend.

Usage

RunAnalysis(processed.obs, processed.config, path, id, sdate = NA,
  edate = NA, control = survreg.control(iter.max = 100),
  sig.level = 0.05, graphics.type = "", merge.pdfs = TRUE,
  site.locations = NULL, is.seasonality = FALSE,
  explanatory.var = NULL, is.residual = FALSE, thin.obs.mo = NULL)

Arguments

processed.obs

list. See documentation for ProcessObs function for details.

processed.config

data.frame. See documentation for ProcessConfig function for details.

path

character. Path name of the folder where output data is written.

id

character. An analysis identifier that is used to construct output file names.

sdate, edate

Date or character. Start and end date of the period of interest, respectively. The required date format is YYYY-MM-DD (%Y-%m-%d).

control

list. Regression control values in the format produced by the survreg.control function.

sig.level

numeric. Significance level against which the p-value is compared, see ‘Value’ section.

graphics.type

character. Graphics type for plot figures. The default is the ‘active’ device, typically the normal screen device. A file-based device can be selected by specifying either "pdf" or "postscript".

merge.pdfs

logical. If true and graphics.type = "pdf", the figures are combined into a single PDF file; see documentation for MergePDFs function for details.

site.locations

SpatialPointsDataFrame. Geo-referenced site coordinates with a required data.frame component of Site_id, a unique site identifier.

is.seasonality

logical. If true, seasonal patterns are modeled by trigonometric regression terms that are added as covariates in the trend model.

explanatory.var

data.frame. An explanatory variable added to the covariates of the trend model, see value from the ProcessWL function for the data table format. Explanatory variable values are linearly interpolated at sample dates in processed.obs.

is.residual

logical. If true, the explanatory variable is transformed using its residuals from linear regression. Should be used when the explanatory variable is monotonically increasing or decreasing during the entire trend period. Requires specification of the explanatory.var argument.

thin.obs.mo

character. Full name of a calendar month; if specified, data is thinned to one observation per year collected during this month. Thinning allows verification that variable sampling frequencies do not substantially affect the trend results. It may also remove serial correlation in the more frequently sampled years.
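
The covariate and thinning options described above can be illustrated with a minimal base-R sketch. It is not the package's internal code: the data, the variable names, the single sine/cosine pair, and the rule of keeping the first observation in the chosen month are all assumptions made for illustration.

```r
# Hypothetical sample dates and explanatory-variable record (illustration only)
obs.dates <- as.Date(c("2001-03-15", "2001-10-02", "2001-10-20",
                       "2002-04-20", "2002-10-11", "2003-04-05"))
wl <- data.frame(Date = as.Date(c("2001-01-01", "2002-01-01",
                                  "2003-01-01", "2004-01-01")),
                 Value = c(10, 12, 13, 16))

# is.seasonality: one sine/cosine pair on decimal years (assumed form)
t.yr <- as.numeric(obs.dates - as.Date("2001-01-01")) / 365.25
seasonal <- cbind(sin = sin(2 * pi * t.yr), cos = cos(2 * pi * t.yr))

# explanatory.var: linear interpolation of the variable at the sample dates
x <- approx(x = as.numeric(wl$Date), y = wl$Value,
            xout = as.numeric(obs.dates))$y

# is.residual: replace the variable with its residuals from a linear fit on time
x.res <- residuals(lm(x ~ t.yr))

# thin.obs.mo: keep one observation per year from the named month
# (here the first one; month.name keeps the match locale-independent)
thin.obs.mo <- "October"
is.mo <- as.integer(format(obs.dates, "%m")) == match(thin.obs.mo, month.name)
in.mo <- obs.dates[is.mo]
thinned <- in.mo[!duplicated(format(in.mo, "%Y"))]
```

Here thinned retains a single October date for each year that has one, and x.res carries only the part of the explanatory variable that is not explained by a linear drift in time.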

Details

The survreg function is used to fit a parametric survival regression model to the observed data, both censored and uncensored. The specific class of survival model is known as the accelerated failure time (AFT) model. A maximum-likelihood estimation (MLE) method is used to estimate parameters in the AFT model. The MLE is solved by maximizing the log-likelihood using the Newton-Raphson method, an iterative root-finding algorithm. The likelihood function is dependent on the distribution of the observed data. Data is assumed to follow a log-normal distribution because most of the variables have values spanning two or more orders of magnitude. If all observations are uncensored, the survival regression becomes identical to ordinary least squares regression.
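
The model described above can be sketched with the survival package. This is a hedged illustration rather than the code used by RunAnalysis: the simulated concentrations, the reporting level rl, and the use of left-censoring only are assumptions.

```r
library(survival)

set.seed(1)
# Hypothetical concentrations with a log-linear trend (illustration only)
t.yr <- seq(0, 10, length.out = 60)
conc <- exp(1 + 0.05 * t.yr + rnorm(60, sd = 0.3))

# Treat values below a hypothetical reporting level as left-censored
rl <- 3
is.cens <- conc < rl
obs <- Surv(time = pmax(conc, rl), event = !is.cens, type = "left")

# AFT model fit by maximum likelihood, assuming log-normally distributed data
fit <- survreg(obs ~ t.yr, dist = "lognormal",
               control = survreg.control(iter.max = 100))

# With no censoring, the coefficients agree with OLS on the log-transformed data
fit.unc <- survreg(Surv(conc) ~ t.yr, dist = "lognormal")
fit.ols <- lm(log(conc) ~ t.yr)
is.same <- isTRUE(all.equal(unname(coef(fit.unc)), unname(coef(fit.ols)),
                            tolerance = 1e-4))
```

The comparison at the end checks only the regression coefficients; the scale estimate from survreg is a maximum-likelihood estimate and need not equal the residual standard error reported by lm.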

Value

Returns a data.frame object with the following components:

Site_id

unique site identifier

Site_name

local site name

Parameter_id

unique parameter identifier

Parameter_name

common parameter name

sdate,edate

start and end date corresponding to the period of interest, respectively.

n

number of observations in the analysis.

nmissing

number of missing values.

nexact

number of exact (uncensored) observations.

nleft

number of left-censored observations.

ninterval

number of interval-censored observations.

nbelow.rl

number of observations that are below the reporting level.

min,max

minimum and maximum, respectively.

median

median

mean,sd

mean and standard deviation, respectively. Set to NA if censored data is present.

iter

number of Newton-Raphson iterations required for convergence. If NA, the regression failed or ran out of iterations and did not converge.

slope

slope of the linear trend over time in percent change per year.

std.err

standard error for the linear trend over time in percent change per year.

p

p-value for the linear trend over time.

p.model

p-value for the parametric survival regression model.

trend

significant trends are indicated by a p-value (p) less than or equal to the significance level. The sign of the slope indicates whether the significant trend is positive (+) or negative (-). p-values greater than the significance level are specified as having no significant trend (none).

If arguments path and id are specified, the returned data table of summary statistics (described above) is written to an external text file. If in addition a file-based graphics type is selected, plots are drawn to external files.
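
The relation between slope, p, and trend can be sketched as follows. The conversion from a natural-log-scale time coefficient to percent change per year is an assumption about how such a slope is typically derived for a log-normal model, and ClassifyTrend is a hypothetical helper, not a function in this package.

```r
# Hypothetical helper: label a trend from its p-value and slope
ClassifyTrend <- function(p, slope, sig.level = 0.05) {
  if (is.na(p) || p > sig.level) return("none")
  if (slope > 0) "+" else "-"
}

# Hypothetical fitted values (illustration only)
b <- 0.05   # coefficient on time in years, natural-log scale
p <- 0.012  # p-value for the time coefficient

# Percent change per year implied by a log-scale slope (assumed conversion)
slope <- 100 * (exp(b) - 1)

trend <- ClassifyTrend(p, slope, sig.level = 0.05)
```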

Author(s)

J.C. Fisher and L.C. Davis, U.S. Geological Survey, Idaho Water Science Center

See Also

DrawPlot

Examples

# Specify global arguments for reading table formatted data in a text file
read.args <- list(header = TRUE, sep = "\t", colClasses = "character",
                  na.strings = "", fill = TRUE, strip.white = TRUE,
                  comment.char = "", flush = TRUE, stringsAsFactors = FALSE)

# Read input files
path.in <- system.file("extdata", package = "Trends")
file <- file.path(path.in, "Observations.tsv")
observations <- do.call(read.table, c(list(file), read.args))
file <- file.path(path.in, "Parameters.tsv")
parameters <- do.call(read.table, c(list(file), read.args))
file <- file.path(path.in, "Detection_Limits.tsv")
detection.limits <- do.call(read.table, c(list(file), read.args))
file <- file.path(path.in, "Config_VOC.tsv")
config <- do.call(read.table, c(list(file), read.args))

# Process observations
processed.obs <- ProcessObs(observations, parameters, detection.limits,
                            date.fmt = "%m/%d/%Y")

# Plot data for a single parameter at a specific site
d <- processed.obs[["P32102"]]
d <- d[d$Site_id == "433002113021701", c("Date", "surv")]
DrawPlot(d, main = "RWMC Production", ylab = "Carbon Tetrachloride")

# Configure sites, parameters, and duration for analysis
processed.config <- tail(ProcessConfig(config, processed.obs))

# Run analysis
stats <- RunAnalysis(processed.obs, processed.config,
                     sdate = "1987-01-01", edate = "2012-12-31")

jfisher-usgs/Trends documentation built on May 19, 2019, 7:16 a.m.