WREG.GLS: Weighted-Multiple-Linear Regression Program (WREG)
In wfarmer-usgs/WREG: USGS WREG v. 3.00


The WREG.GLS function executes the multiple linear regression analysis using generalized least-squares regression.
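As background, the textbook generalized least-squares estimator can be sketched in a few lines of base R. This is a minimal illustration, not the package's internal code: the data and the weighting matrix `Lambda` below are made up, and the names `betaGLS`, `covBeta` and `seBeta` are hypothetical.

```r
# Toy data: 5 sites, a constant plus one predictor (all values are made up).
X <- cbind(1, c(0.5, 1.2, 2.0, 2.7, 3.1))
Y <- c(1.1, 1.9, 3.2, 3.8, 4.5)

# Hypothetical weighting (covariance) matrix: unequal variances on the
# diagonal and a small common covariance between every pair of sites.
Lambda <- matrix(0.02, nrow = 5, ncol = 5)
diag(Lambda) <- c(0.10, 0.15, 0.08, 0.20, 0.12)

# GLS estimate: solve (X' Lambda^-1 X) b = X' Lambda^-1 Y.
LinvX <- solve(Lambda, X)
LinvY <- solve(Lambda, Y)
betaGLS <- solve(t(X) %*% LinvX, t(X) %*% LinvY)

# Covariance of the coefficient estimates and their standard errors.
covBeta <- solve(t(X) %*% LinvX)
seBeta <- sqrt(diag(covBeta))

# Fitted values and residuals.
yHat <- as.vector(X %*% betaGLS)
res <- Y - yHat
```

With `Lambda` set to the identity matrix, the same lines reduce to ordinary least squares.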

WREG.GLS(
  Y,
  X,
  recordLengths,
  LP3,
  basinChars,
  transY,
  x0 = NA,
  alpha = 0.01,
  theta = 0.98,
  peak = T,
  distMeth = 2,
  regSkew = FALSE,
  MSEGR = NA,
  TY = 2,
  legacy = FALSE
)

`Y`	A numeric vector of the dependent variable of interest, with any transformations already applied.
`X`	A numeric matrix of the independent variables in the regression, with any transformations already applied. Each row represents a site and each column represents a particular independent variable. (If a leading constant is used, it should be included here as a leading column of ones.) The rows must be in the same order as the dependent variables in `Y`.
`recordLengths`	A numeric matrix whose rows and columns are in the same order as `Y`. Each `(r,c)` element represents the length of concurrent record between sites `r` and `c`. The diagonal elements therefore represent each site's full record length.
`LP3`	A numeric matrix containing the fitted Log-Pearson Type III standard deviate, standard deviation and skew for each site. The columns of the matrix represent S, K, G, and an optional regional skew value `GR`, which is required by WREG.GLS when `regSkew = TRUE`. The order of the rows must be the same as `Y`.
`basinChars`	A dataframe containing three variables: `StationID` is the identifier of each site, `Lat` is the latitude of the site, in decimal degrees, and `Long` is the longitude of the site, in decimal degrees. The sites must be presented in the same order as `Y`.
`transY`	A required character string indicating whether the dependent variable was transformed by the common logarithm ('log10'), transformed by the natural logarithm ('ln') or untransformed ('none').
`x0`	A vector containing the independent variables (as above) for a particular target site. This variable is only used for ROI analysis.
`alpha`	A numeric. `alpha` is a parameter used in the estimated cross-correlation between site records. See equation 20 in the WREG v. 1.05 manual. The arbitrary, default value is 0.01. The user should fit a different value as needed.
`theta`	A numeric. `theta` is a parameter used in the estimated cross-correlation between site records. See equation 20 in the WREG v. 1.05 manual. The arbitrary, default value is 0.98. The user should fit a different value as needed.
`peak`	A logical. Indicates if the event being modeled is a peak flow event or a low-flow event. `TRUE` indicates a peak flow, while `FALSE` indicates a low-flow event.
`distMeth`	A numeric. A value of `1` indicates that the "Nautical Mile" approximation should be used to calculate inter-site distances. A value of `2` designates the Haversine approximation. See `Dist.WREG`. The default value is `2`. (See the `legacy` details below.)
`regSkew`	A logical vector indicating if regional skews are provided with an adjustment required for uncertainty therein (`TRUE`). The default is `FALSE`.
`MSEGR`	A numeric. The mean squared error of the regional skew. Required only if `regSkew = TRUE`.
`TY`	A numeric. The return period of the event being modeled. Required only for “GLSskew”. The default value is `2`. (See the `legacy` details below.)
`legacy`	A logical. A value of `TRUE` forces the WREG program to behave identically to WREG v. 1.05, bugs and all. It forces `TY = 2` and `distMeth = 1`. For ROI regressions, it also requires a specific calculation of the weighting matrices in “WLS” (see `Omega.WLS.ROImatchMatLab`), “GLS”, and “GLSskew” (see `Omega.GLS.ROImatchMatLab`). Further details are provided in `WREG.RoI`.
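The layout expected for `X` and `recordLengths` can be illustrated with a small sketch. Everything here is hypothetical: the site table, its column names (`startYear`, `endYear`, `area`, `slope`), the choice of a `log10` transformation, and the assumption that each site has a continuous annual record with no gaps.

```r
# Hypothetical site table: IDs, record start/end years and two predictors.
sites <- data.frame(
  StationID = c("A", "B", "C"),
  startYear = c(1950, 1960, 1975),
  endYear   = c(2000, 1990, 2010),
  area      = c(12.5, 48.0, 7.3),
  slope     = c(0.020, 0.011, 0.035)
)

# Predictor matrix with transformations already applied and a leading
# column of ones for the constant term.
X <- cbind(Intercept = 1, log10(as.matrix(sites[, c("area", "slope")])))

# Concurrent record length between sites i and j: the overlap of their
# periods of record (assumes continuous annual records with no gaps).
n <- nrow(sites)
recordLengths <- matrix(0, n, n,
  dimnames = list(sites$StationID, sites$StationID))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    overlap <- min(sites$endYear[i], sites$endYear[j]) -
      max(sites$startYear[i], sites$startYear[j]) + 1
    recordLengths[i, j] <- max(overlap, 0)
  }
}
```

The resulting matrix is symmetric, and its diagonal holds each site's full record length, as the `recordLengths` argument requires.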

In this implementation, the weights for generalized least-squares regression are defined by intersite correlations and record lengths. See manual for details.
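As a rough illustration of how intersite correlations might be estimated from distance, the sketch below combines a Haversine distance with a smoothing function of `alpha` and `theta`. It assumes that the cross-correlation model has the form rho = theta^(d / (alpha * d + 1)), with d the distance between sites; check this against the manual before relying on it. The helper names `haversineMiles` and `crossCorr`, the Earth radius of 3959 miles, and the coordinates are assumptions of this sketch, not part of the package.

```r
# Great-circle distance by the Haversine formula.
# The Earth radius in miles is an assumption of this sketch.
haversineMiles <- function(lat1, lon1, lat2, lon2, radius = 3959) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Smoothed cross-correlation as a function of distance, assuming the form
# rho = theta^(d / (alpha * d + 1)).
crossCorr <- function(d, alpha = 0.01, theta = 0.98) {
  theta^(d / (alpha * d + 1))
}

# Hypothetical site coordinates, in decimal degrees.
lat <- c(40.0, 40.5, 41.2)
lon <- c(-90.0, -89.4, -88.1)

# Pairwise distance and correlation matrices.
n <- length(lat)
distMat <- outer(seq_len(n), seq_len(n),
  function(i, j) haversineMiles(lat[i], lon[i], lat[j], lon[j]))
rhoMat <- crossCorr(distMat)
```

Under this assumed form, the correlation equals 1 at zero distance and decays toward `theta^(1 / alpha)` as the distance grows.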

The logical argument `legacy` has been included to test that this program returns the same results as WREG v. 1.05. In the development of this code, some idiosyncrasies of the MatLab code for WREG v. 1.05 became apparent. Setting `legacy` equal to `TRUE` forces the code to reproduce the idiosyncrasies of WREG v. 1.05. Some of these idiosyncrasies may be bugs in the code; further analysis is needed. For information on the specific idiosyncrasies, see the notes for the `legacy` input and the links to other functions in this package.

All outputs are returned as part of a list. The elements of the list depend on the type of regression performed. The elements of the list may include:

`Coefs`	A data frame composed of four variables: (1) `Coefficient` contains the regression coefficients estimated for the model, (2) `Standard Error` contains the standard errors of each regression coefficient, (3) `tStatistic` contains the Student's T-statistic of each regression coefficient and (4) `pValue` contains the significance probability of each regression coefficient.
`ResLevInf`	A data frame composed of four variables for each site in the regression. `Residual` contains the model residuals. `Leverage` contains the leverage of each site. `Influence` contains the influence of each site. `VarPred` contains the variance of prediction at each site.
`LevLim`	The critical value of leverage. See `Leverage`.
`InflLim`	The critical value of influence. See `Influence`.
`LevInf.Sig`	A logical matrix indicating if the leverage (column 1) is significant and the influence (column 2) is significant for each site in the regression.
`PerformanceMetrics`	A list of not more than ten elements. All regression types return the mean squared error of residuals (`MSE`), the coefficient of determination (`R2`), the adjusted coefficient of determination (`R2_adj`) and the root mean squared error (`RMSE`, in percent). The pseudo coefficient of regression (`R2_pseudo`), the average variance of prediction (`AVP`), the standard error of prediction (`Sp`, in percent), a vector of the individual variances of prediction for each site (`VP.PredVar`), the model-error variance (`ModErrVar`) and the standardized model error variance (`StanModErr`, in percent) are also returned. Details on the appropriateness and applicability of performance metrics can be found in the WREG manual.
`X`	The input predictors.
`Y`	The input observations.
`fitted.values`	A vector of model estimates from the regression model.
`residuals`	A vector of model residuals.
`Weighting`	The weighting matrix used to develop regression estimates.
`Input`	A list of input parameters for error searching. Currently empty.
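For orientation, the first few entries of `PerformanceMetrics` (`MSE`, `R2`, `R2_adj`) can be related to their textbook definitions. The sketch below uses made-up data and an ordinary least-squares fit purely to generate residuals; the exact formulas used by WREG, in particular the conversions to percent, are described in the WREG manual and may differ in detail.

```r
# Toy data (made up): 6 sites, a constant plus one predictor.
X <- cbind(1, c(1.0, 1.5, 2.1, 2.8, 3.3, 4.0))
Y <- c(2.0, 2.6, 3.1, 4.2, 4.4, 5.3)

# Ordinary least-squares fit, used here only to produce residuals.
beta <- solve(t(X) %*% X, t(X) %*% Y)
res <- as.vector(Y - X %*% beta)

n <- nrow(X)  # number of sites
p <- ncol(X)  # number of coefficients, including the constant

SSE <- sum(res^2)            # residual sum of squares
SST <- sum((Y - mean(Y))^2)  # total sum of squares

MSE <- SSE / (n - p)                        # mean squared error of residuals
R2 <- 1 - SSE / SST                         # coefficient of determination
R2_adj <- 1 - (1 - R2) * (n - 1) / (n - p)  # adjusted R2
```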

# Import some example data
peakFQdir <- file.path(
  system.file("exampleDirectory", package = "WREG"),
  "pfqImport")
gisFilePath <- file.path(peakFQdir, "pfqSiteInfo.txt")
importedData <- importPeakFQ(pfqPath = peakFQdir, gisFile = gisFilePath)

# Organizing input data
lp3Data <- importedData$LP3f
lp3Data$K <- importedData$LP3k$AEP_0.5
Y <- importedData$Y$AEP_0.5
X <- importedData$X[c("Sand", "OutletElev", "Slope")]
recordLengths <- importedData$recLen
basinChars <- importedData$BasChars
transY <- "none"

# Run GLS regression
result <- WREG.GLS(Y, X, recordLengths, LP3 = lp3Data, basinChars, transY)