GeoClean: Automated Cleaning of Geographic Coordinates

View source: R/GeoClean.R

Description

Provides a set of tests for cleaning datasets with geographic coordinates. Each logical argument switches one cleaning step on or off; the threshold arguments tune the corresponding steps.

Usage

GeoClean(x, isna = TRUE, isnumeric = TRUE,
         coordinatevalidity = TRUE, containszero = TRUE,
         zerozero = TRUE, zerozerothresh = 1,
         latequallong = TRUE, GBIFhead = FALSE,
         countrycentroid = FALSE, contthresh = 0.5,
         capitalcoords = FALSE, capthresh = 0.5,
         countrycheck = FALSE, polygons,
         referencecountries = countryref,
         outp = c("summary", "detailed", "cleaned"))

Arguments

x

a data.frame with at least three columns: “identifier” (species name), “XCOOR” (longitude) and “YCOOR” (latitude). The columns may alternatively be named “species”, “longitude” and “latitude”. If any of the arguments “countrycentroid”, “capitalcoords” or “countrycheck” is used, a fourth column named “country” is needed, containing the country names as ISO2 or ISO3 codes. Alternatively, a data.frame as downloaded from GBIF.

isna

logical. If TRUE, checks for missing values in the coordinates. Default = TRUE.

isnumeric

logical. If TRUE, checks for non-numeric values in the coordinates. Default = TRUE.

coordinatevalidity

logical. If TRUE, checks for invalid coordinates (XCOOR > 180 or < -180; YCOOR > 90 or < -90). Default = TRUE.

containszero

logical. If TRUE, checks for coordinates that are exactly zero. Default = TRUE.

zerozero

logical. If TRUE, checks whether the coordinates fall within a rectangle around the point 0/0. Default = TRUE.

zerozerothresh

numeric. The size of the rectangle around 0/0 in decimal degrees. Default = 1.

latequallong

logical. If TRUE, checks for rows where XCOOR = YCOOR. Default = TRUE.

GBIFhead

logical. If TRUE, checks whether the coordinates fall within a 0.5 degree rectangle around the GBIF headquarters in Copenhagen. Default = FALSE.

countrycentroid

logical. If TRUE, checks whether the coordinates fall within a rectangle around the centroid of the country specified in x$country. The size of the rectangle can be controlled using the "contthresh" argument. Default = FALSE.

contthresh

numeric. The size of the rectangle around the country centroid (in degrees). The number is half the length of one rectangle side. Default = 0.5.

capitalcoords

logical. If TRUE, checks whether the coordinates fall within a rectangle around the capital of the country specified in x$country. The size of the rectangle can be controlled using the "capthresh" argument. Default = FALSE.

capthresh

numeric. The size of the rectangle around the capital (in degrees). The number is half the length of one rectangle side. Default = 0.5.

countrycheck

logical. If TRUE, checks if the coordinates fall within the country borders of the country indicated in x$country. Default = FALSE.

polygons

The reference polygons for the countrycheck test. By default, the wrld_simpl dataset from the maptools package. The maptools package must be loaded to use countrycheck = TRUE.

referencecountries

The reference coordinates for the country centroids and capitals. By default from the countryref data.

outp

character defining the type of output. See the Value section.
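
The coordinate checks described above can be sketched in base R. This is only an illustration under stated assumptions, not the GeoClean source: the data frame `pts` and the helper `outside_rectangle` are hypothetical, and the threshold is treated as half the side length of the rectangle (documented for contthresh and capthresh, assumed here for zerozerothresh).

```r
# Hypothetical records; column names follow the "x" argument description.
pts <- data.frame(
  identifier = c("sp_a", "sp_b", "sp_c", "sp_d", "sp_e"),
  XCOOR = c(47.5, 200, 0, 0.2, -18.9),
  YCOOR = c(-18.9, 10, 12.3, -0.3, -18.9)
)

# coordinatevalidity: longitude in [-180, 180] and latitude in [-90, 90]
valid <- pts$XCOOR >= -180 & pts$XCOOR <= 180 &
  pts$YCOOR >= -90 & pts$YCOOR <= 90

# containszero: neither coordinate is exactly zero
nonzero <- pts$XCOOR != 0 & pts$YCOOR != 0

# latequallong: longitude differs from latitude
differs <- pts$XCOOR != pts$YCOOR

# Rectangle test of the kind used by zerozero, countrycentroid and
# capitalcoords. ASSUMPTION: half_side is half the rectangle's side length.
outside_rectangle <- function(lon, lat, ref_lon, ref_lat, half_side) {
  !(abs(lon - ref_lon) <= half_side & abs(lat - ref_lat) <= half_side)
}
not_near_origin <- outside_rectangle(pts$XCOOR, pts$YCOOR, 0, 0, half_side = 1)

# One column per check: TRUE = passed, FALSE = suspicious
checks <- data.frame(valid, nonzero, differs, not_near_origin)
print(checks)
```

The same helper could be pointed at a country centroid or capital by passing the reference longitude and latitude for that country instead of 0/0.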

Details

The capital and country centroid coordinates in the countryref dataset are from the CIA World Factbook. The check for country borders is based on the wrld_simpl data from the maptools package. Please note that the ISO2 code for Namibia (“NA”) might cause problems with the countrycheck argument, since R may read it as a missing value. If possible, use ISO3 country codes.
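
The Namibia problem most likely arises when the data are read in, because read.csv treats the string "NA" as a missing value by default. The following sketch uses made-up records (the `csv_text` string is hypothetical) to show the effect and one way to avoid it when ISO2 codes cannot be replaced.

```r
# Two hypothetical records with ISO2 codes; "NA" stands for Namibia here.
csv_text <- paste(
  "identifier,XCOOR,YCOOR,country",
  "sp_a,17.1,-22.6,NA",
  "sp_b,47.5,-18.9,MG",
  sep = "\n"
)

# Default parsing: the string "NA" becomes a missing value.
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(is.na(default_read$country))

# With na.strings = "", only empty fields count as missing,
# so the Namibia code survives as text.
safe_read <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                      na.strings = "")
print(safe_read$country)
```

Using ISO3 codes ("NAM") avoids the issue entirely, which is why they are the safer choice.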

Value

If outp = 'summary', a logical vector with one element per row of the input data.frame: TRUE = clean coordinates, FALSE = suspicious coordinates. If outp = 'detailed', a data.frame with one column for each check that was performed: TRUE = clean coordinates, FALSE = suspicious coordinates. If outp = 'cleaned', a cleaned version of the input data.
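
How the three output types relate can be illustrated with a small base-R sketch. It assumes that a record counts as clean only if it passes every performed check and that the cleaned output keeps exactly those records; the source does not spell this out, and the objects `x` and `detailed` below are hypothetical stand-ins, not GeoClean results.

```r
# Hypothetical input and a hypothetical "detailed"-style result
x <- data.frame(
  identifier = c("sp_a", "sp_b", "sp_c"),
  XCOOR = c(47.5, 200, 0.2),
  YCOOR = c(-18.9, 10, -0.3)
)
detailed <- data.frame(
  validity = c(TRUE, FALSE, TRUE),
  zerozero = c(TRUE, TRUE, FALSE)
)

# ASSUMPTION: the summary flag is TRUE only when every check passed.
summary_flag <- Reduce(`&`, detailed)

# ASSUMPTION: the cleaned data keep only the rows flagged TRUE.
cleaned <- x[summary_flag, ]
print(summary_flag)
print(cleaned)
```

Working from the detailed output in this way makes it possible to see which check flagged a record before deciding whether to drop it.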

Note

See the speciesgeocodeR documentation for further information and examples.

References

Central Intelligence Agency (2014) The World Factbook. Washington, DC.

http://opengeocode.org/download/cow.php

Examples

data(lemurs_test)
library(maptools)  # provides the wrld_simpl polygons

# Load the reference data
data(wrld_simpl)
data(countryref)

# Run all tests except the country border check and return the cleaned data
test <- GeoClean(lemurs_test, GBIFhead = TRUE,
                 countrycentroid = TRUE, contthresh = 0.5,
                 capitalcoords = TRUE, capthresh = 0.5,
                 countrycheck = FALSE, outp = "cleaned")

# Run only the country border check on the cleaned data
insidecountry <- GeoClean(test, isna = FALSE, isnumeric = FALSE,
                          coordinatevalidity = FALSE,
                          containszero = FALSE, zerozero = FALSE,
                          latequallong = FALSE, GBIFhead = FALSE,
                          countrycentroid = FALSE,
                          contthresh = 0.5, capitalcoords = FALSE,
                          capthresh = 0.5, countrycheck = TRUE,
                          polygons = wrld_simpl)

# Same tests as the first call, but with outp = "detailed":
# one column per performed check
test <- GeoClean(lemurs_test, GBIFhead = TRUE,
                 countrycentroid = TRUE, contthresh = 0.5,
                 capitalcoords = TRUE, capthresh = 0.5,
                 countrycheck = FALSE, outp = "detailed")

speciesgeocodeR documentation built on May 30, 2017, 12:34 a.m.