make_cluster_data: Create a single data frame of field attributes from multiple...

View source: R/preprocessing.R

Description

Create a single data frame of field attributes from multiple files to use in clustering

Usage

make_cluster_data(config, plot = TRUE)

Arguments

config

list; a named list containing all the needed inputs. The following must be included:

  • path string; directory where the files are stored

  • files string; names of files to be opened within the directory given by path.

  • file_ids string; names to assign to new columns. Must be in the same order as files.

  • grid_field_name string; the file_id of the field that should be used to create the field grid. This should be a file that is representative because it will be used to make the boundary and grid applied to all other fields.

  • var_of_interest string; column name in each file that should be retained in the final combined data frame.

  • harvest_width numeric; width of harvest header, in meters.

  • alpha numeric; parameter that controls the level of simplification in the field boundary. Larger values produce a simpler boundary that follows the data points less closely. This parameter is passed to alphahull::ashape.

  • passes_to_clip integer; the number of harvest passes to clip when creating field buffer.

  • cellsize_scaler numeric; controls the size of the grid cells. The value of harvest_width * cellsize_scaler is passed to sf::st_make_grid.

  • output_path string; optional, if provided, plots will be saved to the directory given by this path. This can be helpful because some output plots are large and slow to load in the graphics device.
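As a rough illustration of the list described above, a config might be assembled as follows. This is a minimal sketch: the directory, file names, file_ids, column name, and numeric values are all hypothetical placeholders, not values taken from the package.

```r
# Hypothetical config list; every value below is an illustrative assumption.
config <- list(
  path = "data/yield_maps",                          # assumed directory
  files = c("corn_2016.shp", "soy_2017.shp"),        # assumed file names
  file_ids = c("corn_2016", "soy_2017"),             # same order as files
  grid_field_name = "corn_2016",                     # must be one of file_ids
  var_of_interest = "Yld_Vol_Dr",                    # assumed column name
  harvest_width = 4.5,                               # meters
  alpha = 50,                                        # passed to alphahull::ashape
  passes_to_clip = 3L,                               # harvest passes to clip
  cellsize_scaler = 2                                # grid cell = harvest_width * 2
)

# The list would then be passed to the function, for example:
# combined <- make_cluster_data(config, plot = FALSE)
```

output_path is left out here because it is optional; adding it would cause plots to be saved to that directory.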

plot

logical; optional. Should a faceted plot of config$var_of_interest be plotted? Default is TRUE. Note: this only controls whether a plot is drawn in the current RStudio graphics device. plot = FALSE will NOT suppress saving plots if output_path has been given in the config list.

Value

An sf data frame containing the requested columns from each file, named as paste(var_of_interest, file_ids, sep = "_"), aggregated to a common grid of hexagonal polygons of size harvest_width * cellsize_scaler, clipped to the size of the field buffer. The field buffer is the detected field boundary, with its simplification controlled by alpha, minus harvest_width * passes_to_clip. The value of each polygon is the median of the point observations that fall within it.
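The naming and sizing rules in the Value description can be checked with a few lines of base R. This is only a sketch of the arithmetic, not the package's internal code: the input values, the toy observations, and the cell_id column are hypothetical, and tapply stands in for whatever spatial aggregation the function actually performs.

```r
# Hypothetical inputs, chosen only for illustration
var_of_interest <- "Yld_Vol_Dr"
file_ids <- c("corn_2016", "soy_2017")
harvest_width <- 4.5      # meters
cellsize_scaler <- 2
passes_to_clip <- 3

# Column names in the combined data frame
new_cols <- paste(var_of_interest, file_ids, sep = "_")

# Grid cell size and the distance clipped from the detected boundary
cell_size <- harvest_width * cellsize_scaler
clip_dist <- harvest_width * passes_to_clip

# Toy median aggregation: point observations grouped by an assumed cell id
obs <- data.frame(cell_id = c(1, 1, 1, 2, 2),
                  value   = c(10, 12, 50, 7, 9))
cell_medians <- tapply(obs$value, obs$cell_id, median)
```

Using the median rather than the mean keeps a single outlying point, such as the 50 in the first cell, from dominating the cell value.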

Note

  • TODO: can you pass multiple variable names to var_of_interest?

  • TODO: a checking function that makes sure all inputs are correct before loading the files.
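One possible shape for the checking function mentioned in the note is sketched below. The name check_config and its check_files argument are hypothetical and do not exist in the package; the checks only cover constraints stated in the Arguments section.

```r
# Hypothetical helper, not part of plotdesignr: validates a config list
# against the requirements listed under Arguments.
check_config <- function(config, check_files = TRUE) {
  required <- c("path", "files", "file_ids", "grid_field_name",
                "var_of_interest", "harvest_width", "alpha",
                "passes_to_clip", "cellsize_scaler")
  missing_names <- setdiff(required, names(config))
  if (length(missing_names) > 0) {
    stop("config is missing: ", paste(missing_names, collapse = ", "))
  }
  # file_ids must line up one-to-one with files
  if (length(config$files) != length(config$file_ids)) {
    stop("files and file_ids must have the same length")
  }
  # the grid field must be one of the supplied ids
  if (!config$grid_field_name %in% config$file_ids) {
    stop("grid_field_name must be one of file_ids")
  }
  # numeric parameters must be single numbers
  for (nm in c("harvest_width", "alpha", "passes_to_clip", "cellsize_scaler")) {
    if (!is.numeric(config[[nm]]) || length(config[[nm]]) != 1) {
      stop(nm, " must be a single numeric value")
    }
  }
  # optionally confirm the files exist before any loading is attempted
  if (check_files) {
    full_paths <- file.path(config$path, config$files)
    absent <- full_paths[!file.exists(full_paths)]
    if (length(absent) > 0) {
      stop("files not found: ", paste(absent, collapse = ", "))
    }
  }
  invisible(TRUE)
}
```

Calling check_config(config) before make_cluster_data(config) would surface a mismatched or incomplete list before any file is read.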


smmueller/plotdesignr documentation built on Jan. 5, 2022, 10:55 a.m.