WREG.RoI: Region-of-Influence Regression (WREG)
In USGS-R/WREG: USGS WREG v. 2.02


WREG.RoI implements region-of-influence regression in the WREG framework.

WREG.RoI(Y, X, Reg, transY = NA, recordLengths = NA, LP3 = NA,
  regSkew = FALSE, alpha = 0.01, theta = 0.98, BasinChars = NA,
  MSEGR = NA, TY = 2, Peak = T, ROI = c("PRoI", "GRoI", "HRoI"),
  n = NA, D = 250, DistMeth = 2, Legacy = FALSE)

`Y`	The dependent variable of interest, with any transformations already applied.
`X`	The independent variables in the regression, with any transformations already applied. Each row represents a site and each column represents a particular independent variable. (If a leading constant is used, it should be included here as a leading column of ones.) The rows must be in the same order as the dependent variables in `Y`.
`Reg`	A string indicating which type of regression should be applied. The options are: “OLS” for ordinary least-squares regression, “WLS” for weighted least-squares regression, “GLS” for generalized least-squares regression with no uncertainty in regional skew, and “GLSskew” for generalized least-squares regression with uncertainty in regional skew. (In the case of “GLSskew”, the uncertainty in regional skew must be provided as the mean squared error in regional skew.)
`transY`	A required character string indicating whether the dependent variable was transformed by the common logarithm ('log10'), transformed by the natural logarithm ('ln') or left untransformed ('none').
`recordLengths`	This input is required for “WLS”, “GLS” and “GLSskew”. For “GLS” and “GLSskew”, `recordLengths` should be a matrix whose rows and columns are in the same order as `Y`. Each `(r,c)` element represents the length of concurrent record between sites `r` and `c`. The diagonal elements therefore represent each site's full record length. For “WLS”, only the at-site record lengths are needed, so `recordLengths` can be either a vector or the matrix described for “GLS” and “GLSskew”.
`LP3`	A data frame containing the fitted Log-Pearson Type III standard deviation, standard deviate and skew for each site. The names of this data frame are `S`, `K` and `G`, respectively. For “GLSskew”, the regional skew value must also be provided in a variable called `GR`. The order of the rows must be the same as `Y`.
`regSkew`	A logical vector indicating if regional skews are provided with an adjustment required for uncertainty therein (`TRUE`). The default is `FALSE`.
`alpha`	A number, required only for “GLS” and “GLSskew”. `alpha` is a parameter used in the estimated cross-correlation between site records. See equation 20 in the WREG v. 1.05 manual. The arbitrary, default value is 0.01. The user should fit a different value as needed.
`theta`	A number, required only for “GLS” and “GLSskew”. `theta` is a parameter used in the estimated cross-correlation between site records. See equation 20 in the WREG v. 1.05 manual. The arbitrary, default value is 0.98. The user should fit a different value as needed.
`BasinChars`	A dataframe containing three variables: `StationID` is the numerical identifier (without a leading zero) of each site, `Lat` is the latitude of the site, in decimal degrees, and `Long` is the longitude of the site, in decimal degrees. The sites must be presented in the same order as `Y`. Required only for “GLS”.
`MSEGR`	A number. The mean squared error of the regional skew.
`TY`	A number. The return period of the event being modeled. Required only for “GLSskew”. The default value is `2`. (See the `Legacy` details below.)
`Peak`	A logical. Indicates if the event being modeled is a peak flow event or a low-flow event. `TRUE` indicates a peak flow, while `FALSE` indicates a low-flow event.
`ROI`	A string indicating how to define the region of influence. “PRoI” signifies physiographic, independent or predictor-variable region of influence. “GRoI” calls for a geographic region of influence. “HRoI” requests a hybrid region of influence. Details on each approach are provided in the manual for WREG v. 1.0.
`n`	The number of sites to include in the region of influence.
`D`	Required for “HRoI”, the geographic limit within which to search for a physiographic region of influence. In WREG v. 1.05 (see `Legacy` below), this is interpreted in meters. Otherwise, this is interpreted in miles.
`DistMeth`	Required for “GLS” and “GLSskew”. A value of `1` indicates that the "Nautical Mile" approximation should be used to calculate inter-site distances. A value of `2` designates the Haversine approximation. See `Dist.WREG`. The default value is `2`. (See the `Legacy` details below.)
`Legacy`	A logical. A value of `TRUE` forces the WREG program to behave identically to WREG v. 1.05, bugs and all. It will force `TY=2` and `DistMeth=1`. For ROI regressions, it will also require a specific calculation of the weighting matrices in “WLS” (see `Omega.WLS.ROImatchMatLab`), “GLS” and “GLSskew” (see `Omega.GLS.ROImatchMatLab`). `Legacy` also forces the distance limit `D` to be interpreted in meters.
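The `recordLengths` matrix described above can be assembled from each site's period of record. The following is only a minimal sketch: it assumes every record is continuous, so that concurrent record length is just the overlap of two year ranges, and the names `firstYear` and `lastYear` and their values are hypothetical, not part of WREG.

```r
# Hypothetical first and last years of record for four sites
# (illustrative values only, not taken from any real gage).
firstYear <- c(1950, 1960, 1975, 1990)
lastYear  <- c(2000, 1985, 2010, 2015)
nSites <- length(firstYear)

# Concurrent record length between sites i and j, assuming continuous
# records: the overlap of the two year ranges, floored at zero.
recordLengths <- matrix(0, nrow = nSites, ncol = nSites)
for (i in seq_len(nSites)) {
  for (j in seq_len(nSites)) {
    overlap <- min(lastYear[i], lastYear[j]) -
      max(firstYear[i], firstYear[j]) + 1
    recordLengths[i, j] <- max(overlap, 0)
  }
}

# The diagonal holds each site's full record length, which is the only
# information "WLS" needs.
atSiteLengths <- diag(recordLengths)
```

Records with gaps would need a year-by-year comparison instead of a simple range overlap.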

The support for region-of-influence regression is described in the manual of WREG v. 1.0. WREG.RoI iterates through the sites of Y, defines a region of influence and implements the specified regression by calling a WREG function.
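The site-by-site iteration can be illustrated with a small base-R sketch. This is not the package's implementation: it assumes a purely geographic region of influence with ordinary least squares, an Earth radius of 3959 miles, and that the target site counts as a member of its own region. The names `haversineMiles` and `roiSketch` are hypothetical, and the data are synthetic.

```r
# Haversine great-circle distance in miles from one point to many points.
# The Earth radius of 3959 miles is an assumed constant.
haversineMiles <- function(lat1, lon1, lat2, lon2, radius = 3959) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Fit one OLS regression per site, using its n geographically closest sites.
# Y: numeric vector; X: numeric matrix with a leading column of ones.
roiSketch <- function(Y, X, lat, lon, n) {
  nSites <- length(Y)
  coefs <- matrix(NA_real_, nrow = nSites, ncol = ncol(X),
    dimnames = list(NULL, colnames(X)))
  sitesUsed <- matrix(NA_integer_, nrow = nSites, ncol = n)
  for (i in seq_len(nSites)) {
    d <- haversineMiles(lat[i], lon[i], lat, lon)
    # Assumption: the target site (distance zero) is part of its own region.
    roi <- order(d)[seq_len(n)]
    fit <- lm.fit(x = X[roi, , drop = FALSE], y = Y[roi])
    coefs[i, ] <- fit$coefficients
    sitesUsed[i, ] <- roi
  }
  # Each site's estimate comes from the regression built around that site.
  fitted <- rowSums(X * coefs)
  list(fitted.values = fitted, residuals = Y - fitted,
    Coefficients = coefs, Sites.Used = sitesUsed)
}

# Synthetic example: 30 sites, one predictor plus a constant.
set.seed(1)
nSites <- 30
lat <- runif(nSites, 40, 42)
lon <- runif(nSites, -90, -88)
X <- cbind(Constant = 1, A = runif(nSites, 0, 3))
Y <- 1 + 0.8 * X[, "A"] + rnorm(nSites, sd = 0.1)
sketch <- roiSketch(Y, X, lat, lon, n = 10)
```

The coefficient matrix has one row per site and one column per predictor, mirroring the shape of `X`. A physiographic or hybrid region would replace or pre-filter the distance used in `order()`.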

The logical handle Legacy has been included to test that this program returns the same results as WREG v. 1.05. In the development of this code, some idiosyncrasies of the MATLAB code for WREG v. 1.05 became apparent. Setting Legacy equal to TRUE forces the code to reproduce the same idiosyncrasies as WREG v. 1.05. Some of these idiosyncrasies may be bugs in the code; further analysis is needed. For information on the specific idiosyncrasies, see the notes for the Legacy input and the links to other functions in this package.

As with other WREG functions, WREG.RoI returns a large list of regression parameters and metrics. This list varies depending on the Reg specified, but may contain:

`fitted.values`	A vector of model estimates from the regression model.
`residuals`	A vector of model residuals.
`PerformanceMetrics`	A list of four elements. These represent approximate regression performance across all of the region-of-influence regressions. These include the mean squared error of residuals (`MSE`), the coefficient of determination (`R2`), the adjusted coefficient of determination (`R2_adj`) and the root mean squared error (`RMSE`, in percent). Details on the appropriateness and applicability of performance metrics can be found in the WREG manual.
`Coefficients`	A list composed of four elements: (1) `Values` contains the regression coefficients estimated for the model built around each observation, (2) `StanError` contains the standard errors of each regression coefficient for the ROI regression built around each observation, (3) `TStatistic` contains the Student's T-statistic of each regression coefficient for the ROI regression built around each observation and (4) `pValue` contains the significance probability of each regression coefficient for the ROI regression built around each observation. Each element of the list is a matrix the same size as `X`.
`ROI.Regressions`	A list of elements and outputs from each individual ROI regression. These include a matrix of the sites used in each ROI regression (`Sites.Used`, a `length(Y)`-by-`n` matrix), a matrix of the geographic distances between the selected sites in each ROI regression (`Gdist.Used`, a `length(Y)`-by-`n` matrix), a matrix of the physiographic distances between the selected sites in each ROI regression (`Pdist.Used`, a `length(Y)`-by-`n` matrix), a matrix of the observations used in each ROI regression (`Obs.Used`, a `length(Y)`-by-`n` matrix), a matrix of model fits in the region of influence (`Fits`, a `length(Y)`-by-`n` matrix), a matrix of model residuals in the region of influence (`Residuals`, a `length(Y)`-by-`n` matrix), a matrix of leverages for each ROI regression (`Leverage`, a `length(Y)`-by-`n` matrix), a matrix of influences for each ROI regression (`Influence`, a `length(Y)`-by-`n` matrix), a logical matrix indicating if the leverage is significant in each ROI regression (`Leverage.Significance`, a `length(Y)`-by-`n` matrix), a logical matrix indicating if the influence is significant in each ROI regression (`Influence.Significance`, a `length(Y)`-by-`n` matrix), a vector of critical leverage values for each ROI regression (`Leverage.Limits`), a vector of critical influence values for each ROI regression (`Influence.Limits`) and a list of performance metrics for each ROI regression (`PerformanceMetrics`). The last element, `PerformanceMetrics`, is identical to the same output from other functions except that every element is multiplied by the number of observations so as to capture the individual performance of each ROI regression.
`ROI.InputParameters`	A list of input parameters to record the controls on the ROI regression. `D` indicates the limit used in “HRoI”. `n` indicates the size of the region of influence. `ROI` is a string indicating the type of region of influence. `Legacy` is a logical indicating if the WREG v. 1.05 idiosyncrasies were implemented.
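To give a feel for the leverage and influence outputs listed above, the sketch below computes both for a single ordinary least-squares fit in base R. It is an illustration only: it assumes influence is measured by Cook's D, and the limits `2 * k / n` and `4 / n` are common rules of thumb used here as assumptions, so the limits WREG reports may be defined differently. All data are synthetic.

```r
# Synthetic data standing in for one hypothetical region of influence.
set.seed(42)
n <- 12
area <- runif(n, 0.5, 3)
flow <- 1.2 + 0.7 * area + rnorm(n, sd = 0.15)
fit <- lm(flow ~ area)

k <- length(coef(fit))            # number of regression coefficients
leverage <- hatvalues(fit)        # diagonal of the hat matrix
influence <- cooks.distance(fit)  # Cook's D for each observation

# Assumed rule-of-thumb limits; the limits used by WREG may differ.
leverageLimit <- 2 * k / n
influenceLimit <- 4 / n

# Logical flags, one per observation in the region.
leverageSignificant <- leverage > leverageLimit
influenceSignificant <- influence > influenceLimit
```

Repeating this for every region of influence would yield one row of values and flags per site and one limit per regression, which matches the shapes described for `Leverage`, `Influence` and their limits.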

# Load the package and import some example data
library(WREG)
peakFQdir <- file.path(system.file("exampleDirectory", package = "WREG"),
  "pfqImport")
gisFilePath <- file.path(peakFQdir, "pfqSiteInfo.txt")
importedData <- importPeakFQ(pfqPath = peakFQdir, gisFile = gisFilePath)

# Run a simple regression
Y <- importedData$Y$AEP_0.5
X <- importedData$X[c("A")]
transY <- "none"
basinChars <- importedData$BasChars
#result <- WREG.OLS(Y, X, transY)

result <- WREG.RoI(Y, X, Reg = "OLS", transY, BasinChars = basinChars,
  ROI='GRoI', n = 10L)