GeoDistMOS | R Documentation |
Split geographic PSUs into new geographically contiguous PSUs based on a maximum measure of size for each PSU
GeoDistMOS(lat, long, psuID, n, MOS.var, MOS.takeall = 1, Input.ID = NULL)
lat |
latitude variable in an input file. Must be in decimal format. |
long |
longitude variable in an input file. Must be in decimal format. |
psuID |
PSU Cluster ID from an input file. |
n |
Sample size of PSUs; may be a preliminary value used in the computation to identify certainty PSUs |
MOS.var |
Variable used for probability proportional to size sampling |
MOS.takeall |
Threshold relative measure of size value for certainties; must satisfy 0 < |
Input.ID |
ID variable from the input file |
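As an illustration of how n and a measure of size combine into a relative measure of size, here is a minimal base R sketch of probability proportional to size (PPS) inclusion probabilities with certainties capped at 1. The helper pps_probs and the data are hypothetical and are not part of this package; comparing the result with a threshold in the way MOS.takeall is described is an assumption about how the threshold is applied, not the function's confirmed internals.

```r
# Hypothetical helper (not part of the package): PPS inclusion probabilities
# for a sample of size n, with units whose probability reaches 1 fixed as
# certainties and the remaining sample size reallocated to the other units.
pps_probs <- function(mos, n) {
  stopifnot(all(mos >= 0), n <= sum(mos > 0))
  pik <- numeric(length(mos))
  certain <- rep(FALSE, length(mos))
  repeat {
    n_left <- n - sum(certain)
    # Proportional allocation among the units not yet flagged as certainties
    pik[!certain] <- n_left * mos[!certain] / sum(mos[!certain])
    new_certain <- !certain & pik >= 1
    if (!any(new_certain)) break
    certain <- certain | new_certain
    pik[certain] <- 1
  }
  pik
}

# Made-up measures of size for six units
mos <- c(50, 10, 10, 10, 10, 10)
pik <- pps_probs(mos, n = 3)

# Assumed use of the threshold: flag units at or above a take-all value
takeall <- 0.80
data.frame(mos, pik, exceeds = pik >= takeall)
```

The probabilities sum to n, so a preliminary n mainly changes which units cross the threshold.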
GeoDistMOS splits geographic primary sampling units (PSUs) in the input object based on a variable which is used to create the measure of size for each PSU (MOS.var). The goal is to create PSUs of similarly sized MOS. The input file should have one row for each geographic unit, i.e., secondary sampling unit (SSU), with a PSU ID assigned. The latitude and longitude input vectors define the centroid of each input SSU. The complete linkage method for clustering is used. Accordingly, PSUs are split on a distance metric and not on the MOS threshold value. GeoDistMOS calls the function inclusionprobabilities from the sampling package to calculate the inclusion probability for each SSU within a PSU and distHaversine from the geosphere package to calculate the distances between centroids.
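The distance-based splitting described above can be sketched in base R. This is an illustration only: haversine_m is a hypothetical stand-in for geosphere's distHaversine (same default Earth radius of 6378137 metres), the centroids are made up, and cutting the tree into k = 2 groups is an assumption, since the number of pieces the function actually creates for a given PSU is not stated here.

```r
# Hypothetical helper: great-circle (Haversine) distance in metres
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Made-up SSU centroids in one PSU: two near New York, two near Los Angeles
lat  <- c(40.71, 40.73, 34.05, 34.10)
long <- c(-74.00, -73.99, -118.24, -118.30)

# Pairwise distance matrix between centroids
idx <- seq_along(lat)
d <- outer(idx, idx,
           function(i, j) haversine_m(lat[i], long[i], lat[j], long[j]))

# Complete-linkage hierarchical clustering, then cut into two groups
hc <- hclust(as.dist(d), method = "complete")
groups <- cutree(hc, k = 2)
groups
```

Because the tree is built from distances alone, nearby SSUs stay together regardless of their MOS, which is why the resulting pieces are only approximately equal in size.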
A list with two components:
PSU.ID.Max.MOS |
A data frame containing the SSU ID value in character format ( |
PSU.Max.MOS.Info |
A data frame containing the new PSU ID ( |
George Zipf, Richard Valliant
GeoDistPSU, GeoMinMOS
data(Test_Data_US)
# Create PSU ID with GeoDistPSU
g <- GeoDistPSU(Test_Data_US$lat,
                Test_Data_US$long,
                "miles",
                100,
                Input.ID = Test_Data_US$ID)
# Append PSU ID to input file
library(dplyr)
Test_Data_US <- dplyr::inner_join(Test_Data_US, g$PSU.ID, by=c("ID" = "Input.file.ID"))
# Split PSUs with MOS above 0.80
m <- GeoDistMOS(lat = Test_Data_US$lat,
                long = Test_Data_US$long,
                psuID = Test_Data_US$psuID,
                n = 15,
                MOS.var = Test_Data_US$Amount,
                MOS.takeall = 0.80,
                Input.ID = Test_Data_US$ID)
# Create histogram of PSU inclusion probabilities
hist(m$PSU.Max.MOS.Info$psuID.prob,
     breaks = seq(0, 1, 0.1),
     main = "Histogram of PSU Inclusion Probabilities (Certainties = 1)",
     xlab = "Inclusion Probability",
     ylab = "Frequency")