WREG.GLS: Weighted-Multiple-Linear Regression Program (WREG)

View source: R/WREG.GLS.R

Description

The WREG.GLS function executes the multiple linear regression analysis using generalized least-squares regression.
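As a rough illustration of what a generalized least-squares fit computes, the sketch below applies the textbook estimator, beta = (X' L^-1 X)^-1 X' L^-1 Y, to made-up data. The helper name glsFit, the data and the covariance matrix Lambda are all hypothetical; this is not the WREG.GLS source, which builds its weighting matrix from intersite correlations and record lengths.

```r
# Textbook GLS estimator (illustrative only; not the WREG.GLS implementation).
# Y: response vector, X: design matrix, Lambda: error covariance matrix.
glsFit <- function(Y, X, Lambda) {
  LambdaInv <- solve(Lambda)
  XtLi <- t(X) %*% LambdaInv
  covBeta <- solve(XtLi %*% X)        # covariance of the coefficient estimates
  beta <- covBeta %*% XtLi %*% Y      # GLS coefficient estimates
  list(coefficients = drop(beta), covariance = covBeta)
}

# Made-up data: five sites, a leading column of ones and one predictor
X <- cbind(1, c(1.2, 2.5, 3.1, 4.8, 5.0))
Y <- c(2.1, 3.9, 5.2, 7.8, 8.1)

# Hypothetical covariance: unit variances, constant correlation of 0.3
Lambda <- matrix(0.3, 5, 5) + diag(0.7, 5)

fit <- glsFit(Y, X, Lambda)
fit$coefficients
```

When Lambda is the identity matrix, the same function reduces to ordinary least squares, which is a convenient sanity check.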

Usage

WREG.GLS(
  Y,
  X,
  recordLengths,
  LP3,
  basinChars,
  transY,
  x0 = NA,
  alpha = 0.01,
  theta = 0.98,
  peak = T,
  distMeth = 2,
  regSkew = FALSE,
  MSEGR = NA,
  TY = 2,
  legacy = FALSE
)

Arguments

Y

A numeric vector of the dependent variable of interest, with any transformations already applied.

X

A numeric matrix of the independent variables in the regression, with any transformations already applied. Each row represents a site and each column represents a particular independent variable. (If a leading constant is used, it should be included here as a leading column of ones.) The rows must be in the same order as the dependent variables in Y.

recordLengths

A numeric matrix whose rows and columns are in the same order as Y. Each (r,c) element represents the length of concurrent record between sites r and c. The diagonal elements therefore represent each site's full record length.

LP3

A numeric matrix containing the fitted Log-Pearson Type III standard deviation, standard deviate and skew for each site. The columns of the matrix represent S, K, G, and an optional regional skew value GR, which is required by WREG.GLS with regSkew = TRUE. The order of the rows must be the same as Y.

basinChars

A dataframe containing three variables: StationID is the identifier of each site, Lat is the latitude of the site, in decimal degrees, and Long is the longitude of the site, in decimal degrees. The sites must be presented in the same order as Y.

transY

A required character string indicating whether the dependent variable was transformed by the common logarithm ('log10'), transformed by the natural logarithm ('ln') or left untransformed ('none').

x0

A vector containing the independent variables (as above) for a particular target site. This variable is only used for ROI analysis.

alpha

A numeric. alpha is a parameter used in the estimated cross-correlation between site records. See equation 20 in the WREG v. 1.05 manual. The arbitrary, default value is 0.01. The user should fit a different value as needed.

theta

A numeric. theta is a parameter used in the estimated cross-correlation between site records. See equation 20 in the WREG v. 1.05 manual. The arbitrary, default value is 0.98. The user should fit a different value as needed.

peak

A logical. Indicates if the event being modeled is a peak flow event or a low-flow event. TRUE indicates a peak flow, while FALSE indicates a low-flow event.

distMeth

A numeric. A value of 1 indicates that the "Nautical Mile" approximation should be used to calculate inter-site distances. A value of 2 designates the Haversine approximation. See Dist.WREG. The default value is 2. (See the Legacy details below.)

regSkew

A logical vector indicating if regional skews are provided with an adjustment required for uncertainty therein (TRUE). The default is FALSE.

MSEGR

A numeric. The mean squared error of the regional skew. Required only if regSkew = TRUE.

TY

A numeric. The return period of the event being modeled. Required only for “GLSskew”. The default value is 2. (See the Legacy details below.)

legacy

A logical. A value of TRUE forces the WREG program to behave identically to WREG v. 1.05, bugs and all. It will force TY = 2 and distMeth = 1. For ROI regressions, it will also require a specific calculation for weighting matrices in “WLS” (Omega.WLS.ROImatchMatLab), “GLS”, and “GLSskew” (see Omega.GLS.ROImatchMatLab). Further details are provided in WREG.RoI.
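The recordLengths argument is often the least obvious input to assemble. The sketch below shows one way to build such a matrix from per-site years of record, counting the years shared by each pair of sites. The site names and year ranges are invented for illustration, and the approach assumes that concurrent record can be measured as the number of common years.

```r
# Hypothetical years of record for three sites (illustrative data only)
yearsBySite <- list(
  siteA = 1950:1999,
  siteB = 1970:2009,
  siteC = 1990:2019
)

nSites <- length(yearsBySite)
recordLengths <- matrix(0, nrow = nSites, ncol = nSites,
  dimnames = list(names(yearsBySite), names(yearsBySite)))

# Element (i, j) is the number of years common to sites i and j,
# so the diagonal holds each site's full record length.
for (i in seq_len(nSites)) {
  for (j in seq_len(nSites)) {
    recordLengths[i, j] <- length(intersect(yearsBySite[[i]], yearsBySite[[j]]))
  }
}

recordLengths
```

The rows and columns must follow the same site order as Y, so in practice the list of years would be sorted to match Y before the matrix is built.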

Details

In this implementation, the weights for generalized least-squares regression are defined by intersite correlations and record lengths. See the WREG manual for details.
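To make the role of alpha and theta more concrete, the sketch below computes intersite distances with a Haversine formula and then applies a distance-decay correlation of the form rho = theta^(d / (alpha * d + 1)). That functional form is an assumption about equation 20 of the WREG v. 1.05 manual and should be checked against the manual. The helper names haversineMiles and estCorr, the Earth radius, the use of miles and the site coordinates are all hypothetical; Dist.WREG may differ in its details.

```r
# Great-circle distance by the Haversine formula (illustrative helper;
# the radius of 3959 miles is an assumed mean Earth radius).
haversineMiles <- function(lat1, lon1, lat2, lon2, radius = 3959) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Assumed form of the estimated cross-correlation as a function of distance
estCorr <- function(d, alpha = 0.01, theta = 0.98) {
  theta^(d / (alpha * d + 1))
}

# Made-up site table with the columns expected in basinChars
sites <- data.frame(
  StationID = c("A", "B", "C"),
  Lat = c(40.0, 40.5, 41.2),
  Long = c(-105.0, -104.6, -103.9)
)

idx <- seq_len(nrow(sites))
distMat <- outer(idx, idx, function(i, j) {
  haversineMiles(sites$Lat[i], sites$Long[i], sites$Lat[j], sites$Long[j])
})
rhoMat <- estCorr(distMat)
rhoMat
```

Under this form, the correlation equals one at zero distance and decays toward theta^(1 / alpha) as distance grows, which is why the documentation advises fitting alpha and theta to the data rather than relying on the defaults.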

The logical handle legacy has been included to test that this program returns the same results as WREG v. 1.05. In the development of this code, some idiosyncrasies of the MatLab code for WREG v. 1.05 became apparent. Setting legacy equal to TRUE forces the code to use the same idiosyncrasies as WREG v. 1.05. Some of these idiosyncrasies may be bugs in the code. Further analysis is needed. For information on the specific idiosyncrasies, see the notes for the legacy input and the links to other functions in this package.

Value

All outputs are returned as part of a list. The elements of the list depend on the type of regression performed. The elements of the list may include:

Coefs

A data frame composed of four variables: (1) Coefficient contains the regression coefficients estimated for the model, (2) Standard Error contains the standard errors of each regression coefficient, (3) tStatistic contains the Student's t-statistic of each regression coefficient and (4) pValue contains the significance probability of each regression coefficient.

ResLevInf

A data frame composed of four variables for each site in the regression. Residual contains the model residuals. Leverage contains the leverage of each site. Influence contains the influence of each site. VarPred contains the variance of prediction at each site.

LevLim

The critical value of leverage. See Leverage.

InflLim

The critical value of influence. See Influence.

LevInf.Sig

A logical matrix indicating if the leverage (column 1) is significant and the influence (column 2) is significant for each site in the regression.

PerformanceMetrics

A list of not more than ten elements. All regression types return the mean squared error of residuals (MSE), the coefficient of determination (R2), the adjusted coefficient of determination (R2_adj) and the root mean squared error (RMSE, in percent). The pseudo coefficient of regression (R2_pseudo), the average variance of prediction (AVP), the standard error of prediction (Sp, in percent), a vector of the individual variances of prediction for each site (VP.PredVar), the model-error variance (ModErrVar) and the standardized model error variance (StanModErr, in percent) are also returned. Details on the appropriateness and applicability of performance metrics can be found in the WREG manual.

X

The input predictors.

Y

The input observations.

fitted.values

A vector of model estimates from the regression model.

residuals

A vector of model residuals.

Weighting

The weighting matrix used to develop regression estimates.

Input

A list of input parameters for error searching. Currently empty.

Examples

# Import some example data
peakFQdir <- file.path(
  system.file("exampleDirectory", package = "WREG"),
  "pfqImport")
gisFilePath <- file.path(peakFQdir, "pfqSiteInfo.txt")
importedData <- importPeakFQ(pfqPath = peakFQdir, gisFile = gisFilePath)

# Organizing input data
lp3Data <- importedData$LP3f
lp3Data$K <- importedData$LP3k$AEP_0.5
Y <- importedData$Y$AEP_0.5
X <- importedData$X[c("Sand", "OutletElev", "Slope")]
recordLengths <- importedData$recLen
basinChars <- importedData$BasChars
transY <- "none"

# Run GLS regression
result <- WREG.GLS(Y, X, recordLengths, LP3 = lp3Data, basinChars, transY)

wfarmer-usgs/WREG documentation built on July 24, 2020, 1:28 a.m.