| irs | R Documentation |
Select a sample that is not spatially balanced from a point (finite), linear / linestring (infinite), or areal / polygon (infinite) sampling frame using the Independent Random Sampling (IRS) algorithm. The IRS algorithm accommodates unstratified and stratified sampling designs and allows for equal inclusion probabilities, unequal inclusion probabilities according to a categorical variable, and inclusion probabilities proportional to a positive auxiliary variable. Several additional sampling options are included, such as including legacy (historical) sites, requiring a minimum distance between sites, and selecting replacement sites.
irs(
  sframe,
  n_base,
  stratum_var = NULL,
  seltype = NULL,
  caty_var = NULL,
  caty_n = NULL,
  aux_var = NULL,
  legacy_var = NULL,
  legacy_sites = NULL,
  legacy_stratum_var = NULL,
  legacy_caty_var = NULL,
  legacy_aux_var = NULL,
  mindis = NULL,
  maxtry = 10,
  n_over = NULL,
  n_near = NULL,
  wgt_units = NULL,
  pt_density = NULL,
  DesignID = "Site",
  SiteBegin = 1,
  sep = "-",
  projcrs_check = TRUE
)
sframe |
A sampling frame as an |
n_base |
The base sample size required. If the sampling design is unstratified,
this is a single numeric value. If the sampling design is stratified, this is a named
vector or list whose names represent each stratum and whose values represent each
stratum's sample size. These names must match the values of the stratification
variable represented by |
stratum_var |
A character string containing the name of the column from
|
seltype |
A character string or vector indicating the inclusion probability type,
which must be one of the following: |
caty_var |
A character string containing the name of the column from
|
caty_n |
A character vector indicating the expected sample size for each
level of |
aux_var |
A character string containing the name of the column from
|
legacy_var |
This argument can be used instead of |
legacy_sites |
An sf object with a |
legacy_stratum_var |
A character string containing the name of the column from
|
legacy_caty_var |
A character string containing the name of the column from
|
legacy_aux_var |
A character string containing the name of the column from
|
mindis |
A numeric value indicating the desired minimum distance between sampled
sites. If the sampling design is stratified and |
maxtry |
The maximum number of attempts to apply the minimum distance algorithm to obtain
the desired minimum distance between sites. Each iteration takes roughly as long as the
standard GRTS algorithm. Successive iterations will always contain at least as many
sites satisfying the minimum distance requirement as the previous iteration. The algorithm stops
when the minimum distance requirement is met or there are |
n_over |
The number of reverse hierarchically ordered (rho) replacement sites.
If the sampling design is unstratified, then
|
n_near |
The number of nearest neighbor (nn) replacement sites.
If the sampling design is unstratified, |
wgt_units |
The units used to compute the design weights. These
units must be standard units as defined by the |
pt_density |
A positive integer controlling the density of the GRTS approximation
for infinite sampling frames. The GRTS approximation for infinite sampling
frames vastly improves computational efficiency by generating a large but finite set of points and
selecting a sample from those points. |
DesignID |
A character string indicating the naming structure for each
site's identifier selected in the sample, which is matched with |
SiteBegin |
A numeric value indicating the first number to use to match
with |
sep |
A character string that acts as a separator between
|
projcrs_check |
A check for whether the coordinates are projected. If |
n_base is the number of sites used to calculate
the design weights, which is typically the number of sites used in an analysis.
When a panel sampling design is implemented, n_base is typically the
number of sites in all panels that will be sampled in the same temporal period;
it is not the total number of sites in all panels. The sum of n_base and
n_over equals the total number of sites to be visited for all panels plus
any replacement sites that may be required.
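The bookkeeping described above can be sketched with a small worked example. This is only one reading of the paragraph, and every number and variable name other than n_base and n_over is a hypothetical value chosen for illustration.

```r
# Hypothetical panel design: all numbers below are illustrative assumptions
n_panels          <- 5   # number of panels in the design
sites_per_panel   <- 20  # sites in each panel
panels_per_period <- 1   # panels sampled in the same temporal period
n_replacement     <- 10  # replacement sites held in reserve

# n_base counts only the sites sampled in the same temporal period
n_base <- sites_per_panel * panels_per_period

# Sites in the remaining panels, plus the replacements, go into n_over
total_panel_sites <- n_panels * sites_per_panel
n_over <- (total_panel_sites - n_base) + n_replacement

# The sum matches the total sites for all panels plus replacements
print(n_base + n_over == total_panel_sites + n_replacement)
```

Under these assumptions the design weights would be based on 20 sites, while 110 sites in total would be drawn.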
The sampling design sites and additional information about the sampling design. More specifically, it is a list with five elements:
sites_legacy An sf object containing legacy sites. This is
NULL if legacy sites were not included in the sample.
sites_base An sf object containing the base sites. This is NULL
if n_base equals the number of legacy sites.
sites_over An sf object containing the reverse hierarchically
ordered replacement sites. This is NULL if no reverse hierarchically
ordered replacement sites were included in the sample.
sites_near An sf object containing the nearest neighbor
replacement sites. This is NULL if no nearest neighbor replacement
sites were included in the sample.
design A list documenting the specifications of this sampling design.
This can be checked to verify your sampling design ran as intended.
call The original function call.
stratum_var The name of the stratification variable in sframe.
This equals NULL if no stratification is used.
stratum The unique strata. This equals "None" if
the sampling design is unstratified.
n_base The base sample size per stratum.
seltype The selection type per stratum.
caty_var The name of the unequal probability variable in sframe.
This equals NULL if no unequal probability variable is used.
caty_n The expected sample sizes for each level of the
unequal probability grouping variable per stratum. This equals
NULL when seltype is not "unequal".
aux_var The name of the proportional probability (auxiliary) variable in sframe.
This equals NULL if no proportional probability variable is used.
legacy A logical variable indicating whether legacy sites
were included in the sample.
legacy_stratum_var The name of the stratification variable in legacy_sites.
Omitted if legacy sites are not used. This equals NULL if legacy sites were used but
no stratification variable is used.
legacy_caty_var The name of the unequal probability variable in legacy_sites.
Omitted if legacy sites are not used. This equals NULL if legacy sites were used but
no unequal probability variable is used.
legacy_aux_var The name of the proportional probability (auxiliary)
variable in legacy_sites.
Omitted if legacy sites are not used. This equals NULL if legacy sites
were used but no proportional probability variable is used.
mindis The minimum distance requirement desired. This
is NULL when no minimum distance requirement was applied.
n_over The reverse hierarchically ordered replacement
site sample sizes per stratum. If seltype is unequal,
this represents the expected sample sizes. This is NULL
when no reverse hierarchically ordered replacement sites were selected.
n_near The number of nearest neighbor replacement sites
desired. This is NULL when no nearest neighbor replacement
sites were selected.
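As a rough sketch of how the design element might be inspected after a call, the snippet below prints a few of the entries listed above. It assumes the spsurvey package and its bundled NE_Lakes data are installed, and it is guarded so that it does nothing otherwise.

```r
# Guarded so the sketch is skipped when spsurvey is not installed
if (requireNamespace("spsurvey", quietly = TRUE)) {
  library(spsurvey)

  # Stratified design, as in the Examples for this function
  samp_strat <- irs(NE_Lakes, n_base = c(low = 25, high = 30),
                    stratum_var = "ELEV_CAT")

  # Check that the recorded specifications match what was requested
  design <- samp_strat$design
  print(design$stratum_var)  # name of the stratification variable
  print(design$n_base)       # base sample size per stratum
  print(design$seltype)      # selection type per stratum
  print(design$legacy)       # whether legacy sites were included
}
```

Comparing these entries against the arguments you supplied is a quick way to confirm the design ran as intended.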
When non-NULL, the sites_legacy, sites_base,
sites_over, and sites_near objects contain the original columns
in sframe and include a few additional columns. These additional columns
are
siteID A site identifier (as named using the DesignID
and SiteBegin arguments to irs()).
siteuse Whether the site is a legacy site (Legacy), base
site (Base), reverse hierarchically ordered replacement site
(Over), or nearest neighbor replacement site (Near).
replsite The replacement site ordering. replsite is
None if the site is not a replacement site, Next if it is
the next reverse hierarchically ordered replacement site to use, or
Near_, where the word following _ indicates the ordering of sites closest to
the originally sampled site.
lon_WGS84 Longitude coordinates using the WGS84 coordinate
system (EPSG:4326). Only given if coordinates are projected.
lat_WGS84 Latitude coordinates using the WGS84 coordinate
system (EPSG:4326). Only given if coordinates are projected.
X Longitude coordinates using the provided coordinate
system. Only given if coordinates are not projected (i.e., they are geographic or NA).
Y Latitude coordinates using the provided coordinate
system. Only given if coordinates are not projected (i.e., they are geographic or NA).
stratum A stratum indicator. stratum is None
if the sampling design was unstratified. If the sampling design was stratified,
stratum indicates the stratum.
wgt The design weight.
ip The site's original inclusion probability (the reciprocal
of wgt).
caty An unequal probability grouping indicator. caty
is None if the sampling design did not use unequal inclusion probabilities.
If the sampling design did use unequal inclusion probabilities, caty
indicates the unequal probability level.
aux The auxiliary proportional probability variable. This
column is only returned if seltype was proportional in the
original sampling design.
If any columns in sframe contain these names, those columns
from sframe will be automatically prefixed with sframe_
in the sites object. When output is printed, a summary of site counts by
the levels in stratum_var and caty_var is shown.
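The reciprocal relationship between the ip and wgt columns can be illustrated with a conceptual base R sketch for an equal probability design on a finite frame. This is not spsurvey's implementation: the frame, its size N, and the sample size n are hypothetical, and only the column names ip and wgt are borrowed from the output described above.

```r
set.seed(1)  # for a reproducible illustration

N <- 200     # hypothetical number of units in a finite frame
n <- 25      # hypothetical base sample size
frame <- data.frame(id = seq_len(N))

# Independent random sample: each unit has the same chance of selection
picked <- frame[sample(N, n), , drop = FALSE]

picked$ip  <- n / N          # equal inclusion probability
picked$wgt <- 1 / picked$ip  # design weight is the reciprocal of ip

# For this finite frame, the weights sum to the frame size
print(sum(picked$wgt))
```

In this sketch each sampled unit represents N / n units of the frame, which is why the weights sum back to N.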
Tony Olsen olsen.tony@epa.gov
grts to select a sample that is spatially balanced
## Not run:
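# Unstratified design with equal inclusion probabilities
samp <- irs(NE_Lakes, n_base = 100)
print(samp)
# Stratified design: the names of strata_n must match the values of ELEV_CAT
strata_n <- c(low = 25, high = 30)
samp_strat <- irs(NE_Lakes, n_base = strata_n, stratum_var = "ELEV_CAT")
print(samp_strat)
# Unstratified design with five reverse hierarchically ordered replacement sites
samp_over <- irs(NE_Lakes, n_base = 30, n_over = 5)
print(samp_over)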
## End(Not run)
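The examples above cover equal inclusion probabilities only. The sketch below shows how the "unequal" and "proportional" selection types might be requested. It rests on several assumptions to verify against your installed version: that NE_Lakes contains a categorical column AREA_CAT with levels small and large and a positive numeric column AREA, and that caty_n is supplied as a named numeric vector whose values sum to n_base. The code is guarded so that it is skipped when spsurvey is not installed.

```r
# Guarded so the sketch is skipped when spsurvey is not installed
if (requireNamespace("spsurvey", quietly = TRUE)) {
  library(spsurvey)

  # Unequal inclusion probabilities by lake-area category
  # (assumes AREA_CAT has levels "small" and "large")
  caty_n <- c(small = 24, large = 26)
  samp_uneq <- irs(NE_Lakes, n_base = 50, seltype = "unequal",
                   caty_var = "AREA_CAT", caty_n = caty_n)
  print(samp_uneq)

  # Inclusion probabilities proportional to lake area
  # (assumes AREA is a positive numeric column)
  samp_prop <- irs(NE_Lakes, n_base = 50, seltype = "proportional",
                   aux_var = "AREA")
  print(samp_prop)
}
```

Because caty_n gives expected sample sizes, the realized counts per category in samp_uneq may differ from 24 and 26.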