load_raw_all: Load and combine raw data files

View source: R/load.R

load_raw_all R Documentation

Load and combine raw data files

Description

This is a wrapper function which loads and combines raw data files. If r_dir is specified, all matching files in that directory and any nested subdirectories are included. If r_list is specified, only the listed files are included.

Usage

load_raw_all(
  r_dir,
  r_list,
  pattern = "DATA",
  tz = Sys.timezone(),
  tz_disp = NULL,
  dst = FALSE,
  details = 1,
  logger_pattern = NA,
  time_format = "mdy HMS",
  extra_pattern = NULL,
  extra_name = NULL,
  sep = "",
  skip = 0,
  verbose = TRUE,
  feeder_pattern
)

Arguments

r_dir

Character. The directory that holds all your raw data files (can be in subdirectories).

r_list

Character. A list of files to import.

pattern

Character. A regular expression pattern that matches the files you wish to include. Defaults to "DATA" to include only DATA files and not NOTE files.

tz

Character. The time zone the date/times are in (should match one of the zones produced by OlsonNames()). If none is supplied, the user's system time zone is used, with UTC as the fallback if that fails.

tz_disp

Character. The time zone the date/times should be displayed in (if not the same as tz; should match one of the zones produced by OlsonNames()).

dst

Logical. Whether or not to use Daylight Saving Time (DST). When set to FALSE, time zones are converted to the Etc/GMT+X time zones, which do not include DST. Note that this overrides the time zone specification: a time zone of America/Vancouver, which would normally include DST in the summer, will be transformed to a time zone with the same GMT offset but without DST.

details

Numeric. Where to find logger details, either 0 (file name), 1 (first line) or 2 (first two lines). See the Details section.

logger_pattern

Character. A regular expression matching the logger id in the file name. NA (default) matches the file name (extension omitted) or the first line of the file (see the details argument). Alternatively, [GPR]{2,3}[0-9]{1,2} would match the names of TRU loggers.

time_format

Character. The date/time format of the 'date' and 'time' columns combined. Defaults to "mdy HMS". Should be in formats usable by the parse_date_time() function from the lubridate package (e.g., "ymd HMS", "mdy HMS", "dmy HMS", etc.). See details for more information.

extra_pattern

Character vector. A vector of regular expressions matching any extra information in the file or directory names.

extra_name

Character vector. A vector of column names matching the order of extra_pattern for storing the results of the pattern.

sep

Character. An override for the separator in the read.table() call (see sep = under ?read.table for more details).

skip

Numeric. Extra lines to skip in addition to the lines specified by details.

verbose

Logical. Whether to include progress messages or not.

feeder_pattern

Deprecated. Use logger_pattern instead.

Details

Note that if both r_dir and r_list are specified, the directory overrides the file list.
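The two ways of selecting files can be sketched with base R alone. This is only an illustration: the directory layout and file names are hypothetical, and using a recursive list.files() call to stand in for r_dir plus pattern is an assumption about the behaviour, not a copy of the package's internals.

```r
# Build a small nested directory of hypothetical (empty) raw files in a temp folder
r_dir <- file.path(tempdir(), "raw_example")
dir.create(file.path(r_dir, "site1", "week1"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(r_dir, "site1", "GR10DATA.TXT")))
invisible(file.create(file.path(r_dir, "site1", "GR10NOTE.TXT")))
invisible(file.create(file.path(r_dir, "site1", "week1", "GR11DATA.TXT")))

# Assumed equivalent of r_dir + pattern: recursive listing that keeps DATA files only
r_list <- list.files(r_dir, pattern = "DATA", recursive = TRUE, full.names = TRUE)
basename(r_list)

# With feedr installed and real data files, either selection could be passed on:
# r <- load_raw_all(r_dir = r_dir, pattern = "DATA")
# r <- load_raw_all(r_list = r_list)
```

Supplying only one of r_dir or r_list avoids any ambiguity about which one takes effect.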

Each data file is assumed to contain three columns (without column names) corresponding to animal_id, date and time (without date). By default they are expected to be separated by white space, but the sep argument can be modified to reflect other separators, such as comma- or tab-separated data.
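As a rough sketch of the expected layout, the following writes a few hypothetical records to a temporary file and reads them back with read.table(). The animal ids and timestamps are invented for illustration, and the snippet shows only the data lines, not any logger detail lines at the top of a real file.

```r
# Hypothetical file contents: animal_id, date, time; no header, white-space separated
lines <- c("0620000514 01/16/2016 10:48:49",
           "041868D396 01/16/2016 10:48:50",
           "041868D396 01/16/2016 10:53:31")
f <- tempfile(fileext = ".TXT")
writeLines(lines, f)

# sep = "" splits on any white space; character columns keep leading zeros in ids
raw <- read.table(f, sep = "", header = FALSE,
                  col.names = c("animal_id", "date", "time"),
                  colClasses = "character")
raw

# The same records, comma-separated, would need sep = ","
f_csv <- tempfile(fileext = ".csv")
writeLines(gsub(" ", ",", lines), f_csv)
raw_csv <- read.table(f_csv, sep = ",", header = FALSE,
                      col.names = c("animal_id", "date", "time"),
                      colClasses = "character")
```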

The columns date and time will be combined to extract the date/time of each event. Thus, the time_format argument specifies the order of the combined date and time columns and should be in a format usable by the parse_date_time() function from the lubridate package (e.g., "ymd HMS", "mdy HMS", "dmy HMS", etc.). For example, the default "mdy HMS" expects a date column in the format of month/day/year and a time column in the format of H:M:S. Note that separators and leading zeros are ignored, thus month-day-year is equivalent to month/day/year; see the orders argument of the parse_date_time() function for more information. More complex formats can also be specified: for example, 09/30/16 2:00 pm can be specified by time_format = "mdy HM p".
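To see how a time_format string behaves, it can help to try it directly with lubridate::parse_date_time(). The sketch below assumes the lubridate package is installed; the timestamps are made up, and passing the format through the orders argument is how the function is called on its own, not necessarily how load_raw_all() calls it internally.

```r
library(lubridate)

# Default format: month/day/year followed by H:M:S
a <- parse_date_time("01/16/2016 10:48:49", orders = "mdy HMS", tz = "UTC")

# Different separators and no leading zero, same orders string
b <- parse_date_time("1-16-2016 10:48:49", orders = "mdy HMS", tz = "UTC")
a == b

# 12-hour clock with an am/pm marker
p <- parse_date_time("09/30/16 2:00 pm", orders = "mdy HM p", tz = "UTC")
p
```

If a format string is wrong for the data, parse_date_time() returns NA with a warning, so checking a handful of values this way before loading a full directory can save time.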

Logger details are the logger_id and the lat/lon for the logger, and the details argument specifies where to find them. A value of 0 reflects that the logger_id is in the file name, defined by the pattern logger_pattern. A value of 1 reflects that the logger_id is in the first line of the file, also defined by the pattern logger_pattern. A value of 2 reflects that in addition to the logger_id being in the first line of the file, the lat/lon information is on the second line, in the format of "latitude, longitude", both in decimal format (spacing doesn't matter, but the comma does).
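The three settings of details can be illustrated with base R string functions. This is a minimal sketch under stated assumptions: the file name, header lines and coordinates are hypothetical, and the extraction code mimics the described behaviour without being taken from the package.

```r
# Hypothetical file name and first two lines of a raw file
file_name <- "GR10DATA_2016_01_16.TXT"
header <- c("GR10DATA", "53.89086,  -122.81933")

# A logger_pattern-style regular expression for TRU logger names
logger_pattern <- "[GPR]{2,3}[0-9]{1,2}"

# details = 0: logger_id is taken from the file name
id_from_name <- regmatches(file_name, regexpr(logger_pattern, file_name))

# details = 1: the same pattern is applied to the first line of the file
id_from_line <- regmatches(header[1], regexpr(logger_pattern, header[1]))

# details = 2: the second line holds "latitude, longitude"; split on the comma
latlon <- as.numeric(trimws(strsplit(header[2], ",")[[1]]))
names(latlon) <- c("lat", "lon")
latlon
```

Because the split is on the comma, extra spaces around the coordinates are harmless, while a missing comma would leave a single unparseable value.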


animalnexus/feedr documentation built on Feb. 2, 2023, 1:12 a.m.