Coordinate based cleaning
coord_incomplete(x, lat = NULL, lon = NULL, drop = TRUE)
coord_imprecise(x, which = "both", lat = NULL, lon = NULL, drop = TRUE)
coord_impossible(x, lat = NULL, lon = NULL, drop = TRUE)
coord_unlikely(x, lat = NULL, lon = NULL, drop = TRUE)
coord_within(
  x,
  field = NULL,
  country = NULL,
  lat = NULL,
  lon = NULL,
  drop = TRUE
)
coord_pol_centroids(x, lat = NULL, lon = NULL, drop = TRUE)
coord_uncertain(
  x,
  coorduncertainityLimit = 30000,
  drop = TRUE,
  ignore.na = FALSE
)
x - (data.frame) A data.frame
lat, lon - (character) Latitude and longitude columns to use. See Details.
drop - (logical) Whether to drop bad data points. Either way, the bad data points are parsed out and attached as an attribute you can access. Default: TRUE
which - (character) One of "has_dec", "no_zeros", or "both" (default)
field - (character) Name of the field in the input data.frame x that holds country names
country - (character) A single country name
coorduncertainityLimit - (numeric) Threshold for the coordinateUncertaintyInMeters variable. Default: 30000
ignore.na - (logical) Whether to consider NA values as bad points. Default: FALSE
Explanation of the functions:
coord_impossible - Impossible coordinates
coord_incomplete - Incomplete coordinates
coord_imprecise - Imprecise coordinates
coord_pol_centroids - Points at political centroids
coord_unlikely - Unlikely coordinates
coord_within - Filter points within user input political boundaries
coord_uncertain - Uncertain occurrences, as measured through coordinateUncertaintyInMeters (default limit: 30000)
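As a rough illustration of what the "impossible" check could look like, here is a minimal base R sketch. It assumes that "impossible" means a latitude outside [-90, 90] or a longitude outside [-180, 180]; the helper name is_impossible is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package). Assumption: a coordinate is
# "impossible" when it falls outside the valid latitude/longitude ranges.
is_impossible <- function(lat, lon) {
  bad <- abs(lat) > 90 | abs(lon) > 180
  # Missing values are not flagged here; incompleteness is a separate check
  bad[is.na(bad)] <- FALSE
  bad
}

pts <- data.frame(
  name = c("a", "b", "c"),
  latitude = c(45.2, 170, -12.5),
  longitude = c(-120.1, 30, 200)
)

# Flag each row, then keep only the rows that pass
flags <- is_impossible(pts$latitude, pts$longitude)
clean <- pts[!flags, , drop = FALSE]
NROW(clean)
```

Only the first row survives here, since the second has a latitude of 170 and the third a longitude of 200.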
If either lat or lon (or both) is given, we rename the given columns to the standardized names "latitude" and "longitude". If not given, we attempt to guess which columns hold latitude and longitude and rename them the same way. Standardized names make downstream processing easier because every function deals with consistent column names. The data is returned with its original column names.
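The rename-then-restore idea can be sketched in base R as follows. This is only an illustration under stated assumptions: the helpers guess_col, standardize_coords and restore_coords, the candidate name lists, and the "orig_names" attribute are all hypothetical and do not reflect the package's internals.

```r
# Hypothetical sketch. The candidate column names below are assumptions.
guess_col <- function(nms, candidates) {
  hit <- nms[tolower(nms) %in% candidates]
  if (length(hit) != 1) {
    stop("could not guess a unique column; pass it explicitly")
  }
  hit
}

standardize_coords <- function(x, lat = NULL, lon = NULL) {
  if (is.null(lat)) lat <- guess_col(names(x), c("latitude", "lat"))
  if (is.null(lon)) lon <- guess_col(names(x), c("longitude", "lon", "long", "lng"))
  # Remember the original names so they can be restored on return
  orig <- c(latitude = lat, longitude = lon)
  names(x)[match(orig, names(x))] <- names(orig)
  attr(x, "orig_names") <- orig
  x
}

restore_coords <- function(x) {
  orig <- attr(x, "orig_names")
  names(x)[match(names(orig), names(x))] <- unname(orig)
  attr(x, "orig_names") <- NULL
  x
}

df <- data.frame(name = "a", mylon = 10.5, mylat = 50.1)
std <- standardize_coords(df, lat = "mylat", lon = "mylon")
names(std)   # standardized names while processing
back <- restore_coords(std)
names(back)  # original names on return
```

Keeping the original names in an attribute means any intermediate step can rely on "latitude" and "longitude" without the caller ever seeing the rename.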
For coord_within, we use the countriesLow dataset from the rworldmap package to get country borders.
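Returns a data.frame. The records identified as bad are attached as an attribute named after the function, for example attr(x, "coord_impossible").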
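The "return the data, keep the bad records as an attribute" pattern can be sketched in base R like this. The helper split_bad is hypothetical and only illustrates the pattern; the attribute name "coord_impossible" follows the examples on this page.

```r
# Hypothetical helper (not part of the package): attach the flagged rows as an
# attribute, and optionally drop them from the returned data.frame.
split_bad <- function(x, bad, name, drop = TRUE) {
  removed <- x[bad, , drop = FALSE]
  out <- if (drop) x[!bad, , drop = FALSE] else x
  attr(out, name) <- removed
  out
}

pts <- data.frame(latitude = c(10, 170, -5), longitude = c(20, 30, 40))
bad <- abs(pts$latitude) > 90

# drop = TRUE: bad rows are removed but still reachable via the attribute
res <- split_bad(pts, bad, "coord_impossible")
NROW(res)
attr(res, "coord_impossible")

# drop = FALSE: all rows are kept, and the attribute is attached all the same
kept <- split_bad(pts, bad, "coord_impossible", drop = FALSE)
NROW(kept)
```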
Right now, coord_pol_centroids only deals with city centroids, using the maps::world.cities dataset of more than 40,000 cities. We'll work on adding country centroids, and perhaps others (e.g., counties, states, provinces, parks, etc.).
df <- sample_data_1
# Remove impossible coordinates
NROW(df)
df[1, "latitude"] <- 170
df <- dframe(df) %>% coord_impossible()
NROW(df)
attr(df, "coord_impossible")
# Remove incomplete cases
NROW(df)
df_inc <- dframe(df) %>% coord_incomplete()
NROW(df_inc)
attr(df_inc, "coord_incomplete")
# Remove imprecise cases
df <- sample_data_5
NROW(df)
## remove records that don't have decimals at all
df_imp <- dframe(df) %>% coord_imprecise(which = "has_dec")
NROW(df_imp)
attr(df_imp, "coord_imprecise")
## remove records that have all zeros
df_imp <- dframe(df) %>% coord_imprecise(which = "no_zeros")
NROW(df_imp)
attr(df_imp, "coord_imprecise")
## remove both records that don't have decimals at all and those that
## have all zeros
df_imp <- dframe(df) %>% coord_imprecise(which = "both")
NROW(df_imp)
attr(df_imp, "coord_imprecise")
# Remove unlikely points
NROW(df)
df_unlikely <- dframe(df) %>% coord_unlikely()
NROW(df_unlikely)
attr(df_unlikely, "coord_unlikely")
# Remove points not within correct political borders
if (requireNamespace("rgbif", quietly = TRUE) && interactive()) {
  library("rgbif")
  wkt <- 'POLYGON((30.1 10.1,40 40,20 40,10 20,30.1 10.1))'
  res <- rgbif::occ_data(geometry = wkt, limit = 300)$data
} else {
  res <- sample_data_4
}
## By specific country name
if (
  interactive() &&
  requireNamespace("sf", quietly = TRUE) &&
  requireNamespace("s2", quietly = TRUE) &&
  requireNamespace("rworldmap", quietly = TRUE)
) {
  NROW(res)
  df_within <- dframe(res) %>% coord_within(country = "Israel")
  NROW(df_within)
  attr(df_within, "coord_within")
  ## By a field in your data - makes sure your points occur in one
  ## of those countries
  NROW(res)
  df_within <- dframe(res) %>% coord_within(field = "country")
  NROW(df_within)
  head(df_within)
  attr(df_within, "coord_within")
}
# Remove those very near political centroids
## not ready yet
# NROW(df)
# df_polcent <- dframe(df) %>% coord_pol_centroids()
# NROW(df_polcent)
# attr(df_polcent, "coord_polcent")
## lat/long column names can vary
df <- sample_data_1
head(df)
names(df)[2:3] <- c('mylon', 'mylat')
head(df)
df[1, "mylat"] <- 170
dframe(df) %>% coord_impossible(lat = "mylat", lon = "mylon")
# Remove uncertain occurrences
df <- sample_data_6
NROW(df)
df1 <- df %>% coord_uncertain()
NROW(df1)
## the removed records are attached to the result, not to the input
attr(df1, "coord_uncertain")
## use a stricter limit
NROW(df)
df2 <- df %>% coord_uncertain(coorduncertainityLimit = 20000)
NROW(df2)
## same limit, with ignore.na = TRUE
NROW(df)
df3 <- df %>% coord_uncertain(coorduncertainityLimit = 20000, ignore.na = TRUE)
NROW(df3)