# tidyverse: Tidyverse methods for sf objects (remove .sf suffix!)

In sf: Simple Features for R

## Description

Tidyverse methods for sf objects. Geometries are sticky; use `as.data.frame` to let `dplyr`'s own methods drop them. Call these methods without the .sf suffix, after loading the tidyverse package that provides the generic (or after loading package tidyverse).

## Usage

```r
filter.sf(.data, ..., .dots)

arrange.sf(.data, ..., .dots)

group_by.sf(.data, ..., add = FALSE)

ungroup.sf(x, ...)

mutate.sf(.data, ..., .dots)

transmute.sf(.data, ..., .dots)

select.sf(.data, ...)

rename.sf(.data, ...)

slice.sf(.data, ..., .dots)

summarise.sf(.data, ..., .dots, do_union = TRUE)

distinct.sf(.data, ..., .keep_all = FALSE)

gather.sf(
  data,
  key,
  value,
  ...,
  na.rm = FALSE,
  convert = FALSE,
  factor_key = FALSE
)

spread.sf(
  data,
  key,
  value,
  fill = NA,
  convert = FALSE,
  drop = TRUE,
  sep = NULL
)

sample_n.sf(tbl, size, replace = FALSE, weight = NULL, .env = parent.frame())

sample_frac.sf(
  tbl,
  size = 1,
  replace = FALSE,
  weight = NULL,
  .env = parent.frame()
)

nest.sf(.data, ...)

separate.sf(
  data,
  col,
  into,
  sep = "[^[:alnum:]]+",
  remove = TRUE,
  convert = FALSE,
  extra = "warn",
  fill = "warn",
  ...
)

separate_rows.sf(data, ..., sep = "[^[:alnum:]]+", convert = FALSE)

unite.sf(data, col, ..., sep = "_", remove = TRUE)

unnest.sf(data, ..., .preserve = NULL)

inner_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

left_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

right_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

full_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

semi_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

anti_join.sf(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
```

## Arguments

| Argument | Description |
|---|---|
| `.data` | data object of class sf |
| `...` | other arguments |
| `.dots` | see corresponding function in package `dplyr` |
| `add` | see corresponding function in dplyr |
| `x` | A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details. |
| `do_union` | logical; in case `summarise` does not create a geometry column, should geometries be created by unioning using `st_union`, or simply by combining using `st_combine`? Using `st_union` resolves internal boundaries, but in case of unioning points, this will likely change the order of the points; see Details. |
| `.keep_all` | see corresponding function in dplyr |
| `data` | see original function docs |
| `key` | see original function docs |
| `value` | see original function docs |
| `na.rm` | see original function docs |
| `convert` | see `separate_rows` |
| `factor_key` | see original function docs |
| `fill` | see original function docs |
| `drop` | see original function docs |
| `sep` | see `separate_rows` |
| `tbl` | see original function docs |
| `size` | see original function docs |
| `replace` | see original function docs |
| `weight` | see original function docs |
| `.env` | see original function docs |
| `col` | see `separate` |
| `into` | see `separate` |
| `remove` | see `separate` |
| `extra` | see `separate` |
| `.preserve` | see `unnest` |
| `y` | A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details. |
| `by` | A character vector of variables to join by. If `NULL`, the default, `*_join()` will perform a natural join, using all variables in common across `x` and `y`. A message lists the variables so that you can check they're correct; suppress the message by supplying `by` explicitly. To join by different variables on `x` and `y`, use a named vector. For example, `by = c("a" = "b")` will match `x$a` to `y$b`, and `by = c("a" = "b", "c" = "d")` will match `x$a` to `y$b` and `x$c` to `y$d`. To join by multiple variables, use a vector with length > 1. For example, `by = c("a", "b")` will match `x$a` to `y$a` and `x$b` to `y$b`. To perform a cross-join, generating all combinations of `x` and `y`, use `by = character()`. |
| `copy` | If `x` and `y` are not from the same data source, and `copy` is `TRUE`, then `y` will be copied into the same src as `x`. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it. |
| `suffix` | If there are non-joined duplicate variables in `x` and `y`, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2. |
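
As a rough illustration of the `by` argument with an sf object on the left-hand side, the sketch below joins a small attribute table onto point features using a named vector. The objects `pts` and `attrs` and their column names are made up for this example; it assumes `y` is a plain data frame rather than an sf object.

```r
library(sf)
library(dplyr)

# Hypothetical point features (illustrative data, not from the package)
pts <- st_sf(
  id = 1:3,
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)))
)

# Hypothetical attribute table whose key column has a different name
attrs <- data.frame(key = 1:2, label = c("one", "two"))

# Named vector: match pts$id to attrs$key
joined <- pts %>% left_join(attrs, by = c("id" = "key"))

class(joined)  # the result should still carry class "sf"
joined
```

Rows of `pts` without a match in `attrs` are kept by `left_join`, with missing values in the joined columns.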

## Details

`select` keeps the geometry regardless of whether it is selected or not; to deselect it, first pipe through `as.data.frame` to let dplyr's own `select` drop it.
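
The sticky-geometry behaviour of `select` can be sketched with a tiny sf object. The data below (`pts`, `id`, `value`) are invented for illustration only.

```r
library(sf)
library(dplyr)

# Hypothetical sf object with two attribute columns
pts <- st_sf(
  id = 1:3,
  value = c(10, 20, 30),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)))
)

# The sf method keeps the geometry even though only `value` was selected
kept <- pts %>% select(value)
names(kept)
class(kept)

# Converting to a plain data.frame first lets dplyr's own select drop it
dropped <- pts %>% as.data.frame() %>% select(value)
names(dropped)
class(dropped)
```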

In case one or more of the arguments (expressions) in the `summarise` call creates a geometry list-column, the first of these will be the (active) geometry of the returned object. If this is not the case, a geometry column is created, depending on the value of `do_union`.

In case `do_union` is `FALSE`, `summarise` will simply combine geometries using c.sfg. When polygons sharing a boundary are combined, this leads to geometries that are invalid; see for instance https://github.com/r-spatial/sf/issues/681.
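
A minimal sketch of the difference between the two settings, using two unit squares that share an edge. The helper `sq()` and the object `polys` are hypothetical and exist only for this example; no CRS is set, so planar geometry operations are assumed.

```r
library(sf)
library(dplyr)

# Hypothetical helper: unit square with lower-left corner at (x0, 0)
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two adjacent squares in the same group
polys <- st_sf(
  grp = c("a", "a"),
  val = c(1, 3),
  geometry = st_sfc(sq(0), sq(1))
)

# Default: geometries are unioned, dissolving the shared boundary
unioned <- polys %>% group_by(grp) %>% summarise(val = mean(val))

# do_union = FALSE: geometries are only combined, the shared edge remains
combined <- polys %>%
  group_by(grp) %>%
  summarise(val = mean(val), do_union = FALSE)

st_geometry_type(unioned)
st_geometry_type(combined)

# Check whether the combined geometry is valid
st_is_valid(combined)
```

Because the combined result keeps both squares as parts of one multi-part geometry with a common edge, the validity check is expected to flag it, which is the situation the linked issue describes.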

`distinct` gives distinct records for which all attributes and geometries are distinct; st_equals is used to find out which geometries are distinct.
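
For instance, appending an exact copy of one feature and calling `distinct` should remove the copy again. The object `pts` below is invented for this sketch.

```r
library(sf)
library(dplyr)

# Hypothetical point features
pts <- st_sf(
  id = 1:3,
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)))
)

# Append a copy of the first feature (same attributes, same geometry)
dup <- rbind(pts, pts[1, ])
nrow(dup)

# distinct() drops records whose attributes and geometry are both repeated
uniq <- dup %>% distinct()
nrow(uniq)
```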

`nest` assumes that a simple feature geometry list-column was among the columns that were nested.
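
A small sketch of nesting grouped features, assuming the geometry column ends up inside the nested list-column as described above. The object `pts` and its `grp` column are made up for illustration.

```r
library(sf)
library(dplyr)
library(tidyr)

# Hypothetical point features in two groups
pts <- st_sf(
  grp = c("a", "a", "b"),
  value = c(1, 2, 3),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)))
)

# Group, then nest: the non-grouping columns (including the geometry)
# move into the list-column `data`, one element per group
nested <- pts %>% group_by(grp) %>% nest()

nrow(nested)
class(nested$data[[1]])
```

Each element of `nested$data` can then be passed to geometry functions such as `st_combine`, as the storms example in the Examples section does.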

## Value

an object of class sf

## Examples

```r
library(sf)
library(dplyr)
nc = st_read(system.file("shape/nc.shp", package="sf"))
nc %>% filter(AREA > .1) %>% plot()
# plot 10 smallest counties in grey:
st_geometry(nc) %>% plot()
nc %>% select(AREA) %>% arrange(AREA) %>% slice(1:10) %>% plot(add = TRUE, col = 'grey')
title("the ten counties with smallest area")
nc$area_cl = cut(nc$AREA, c(0, .1, .12, .15, .25))
nc %>% group_by(area_cl) %>% class()
nc2 <- nc %>% mutate(area10 = AREA/10)
nc %>% transmute(AREA = AREA/10, geometry = geometry) %>% class()
nc %>% transmute(AREA = AREA/10) %>% class()
# geometry is kept whether or not it is selected:
nc %>% select(SID74, SID79) %>% names()
nc %>% select(SID74, SID79, geometry) %>% names()
nc %>% select(SID74, SID79) %>% class()
nc %>% select(SID74, SID79, geometry) %>% class()
nc2 <- nc %>% rename(area = AREA)
nc %>% slice(1:2)
# summarise by area class:
nc$area_cl = cut(nc$AREA, c(0, .1, .12, .15, .25))
nc.g <- nc %>% group_by(area_cl)
nc.g %>% summarise(mean(AREA))
nc.g %>% summarise(mean(AREA)) %>% plot(col = grey(3:6 / 7))
nc %>% as.data.frame %>% summarise(mean(AREA))
nc[c(1:100, 1:10), ] %>% distinct() %>% nrow()
library(tidyr)
nc %>% select(SID74, SID79) %>% gather("VAR", "SID", -geometry) %>% summary()
nc$row = 1:100 # needed for spread to work
nc %>% select(SID74, SID79, geometry, row) %>%
  gather("VAR", "SID", -geometry, -row) %>%
  spread(VAR, SID) %>% head()
# build one LINESTRING track per storm and year:
storms.sf = st_as_sf(storms, coords = c("long", "lat"), crs = 4326)
x <- storms.sf %>% group_by(name, year) %>% nest
trs = lapply(x$data, function(tr) st_cast(st_combine(tr), "LINESTRING")[[1]]) %>%
  st_sfc(crs = 4326)
trs.sf = st_sf(x[,1:2], trs)
plot(trs.sf["year"], axes = TRUE)
```

sf documentation built on July 1, 2020, 5:13 p.m.