clean_data: Load a mark-recapture data set

View source: R/clean_data.R

clean_data    R Documentation

Load a mark-recapture data set

Description

This function loads and parses a mark-recapture data set. It expects capture and survey data, plus optional translocation and removal data, each supplied as a data frame.

Usage

clean_data(
  captures,
  surveys,
  translocations = NA,
  removals = NA,
  capture_formula = ~1,
  survival_formula = ~1,
  survival_fill_value = NA
)

Arguments

captures

Data frame containing capture-recapture data. Necessary columns include 'pit_tag_id' and 'survey_date'.

surveys

Data frame containing survey data. Necessary columns include 'survey_date', 'primary_period', and 'secondary_period'. Secondary periods in which individuals are added to or removed from a population should be set to zero, and each must occur in its own primary period (because individuals are assumed not to change states within a primary period).
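
As a rough illustration of the layout described above, the sketch below builds a small survey table in base R and checks that the zero-valued secondary period sits alone in its primary period. The dates and period numbers are made up for illustration, and the check is not part of 'clean_data'; any further requirements the function places on these columns are not covered here.

```r
# Hypothetical survey table: two primary periods of ordinary surveys, with a
# translocation event between them that gets its own primary period.
surveys <- data.frame(
  survey_date = as.Date(c("2020-06-01", "2020-06-02",
                          "2020-06-15",
                          "2020-07-01", "2020-07-02")),
  primary_period = c(1, 1, 2, 3, 3),
  secondary_period = c(1, 2, 0, 1, 2)
)

# Primary periods that contain a zero-valued secondary period.
zero_periods <- surveys$primary_period[surveys$secondary_period == 0]

# Count the rows in each of those primary periods; each should contain one row.
n_rows <- vapply(
  zero_periods,
  function(p) sum(surveys$primary_period == p),
  integer(1)
)
zero_is_alone <- all(n_rows == 1)
```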

translocations

Optional data frame with translocation data. Necessary columns include 'pit_tag_id' and 'release_date'. If nothing is provided to this argument, the 'clean_data' function assumes that there are no translocations of individuals into the population.

removals

Optional data frame with removal data. Necessary columns include 'pit_tag_id' and 'removal_date'. If nothing is provided, 'clean_data' assumes there are no removals from the population. This can be used to account for individuals being pulled out of a population (e.g., for translocation), and for tagged individuals whose carcasses are found.

capture_formula

An optional formula specifying the structure of survey-level capture probability covariates. Any variables in this formula must be columns in the 'surveys' data frame. The formula must start with '~' and can be provided unquoted, e.g., 'capture_formula = ~ temperature'. It is advisable to ensure that any continuous covariates provided in this formula are appropriately scaled (ideally, with mean = 0, and standard deviation = 1).
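
One way to follow the scaling advice is base R's 'scale()', sketched below. The 'temperature' values are invented for illustration, and 'temperature_scaled' is a hypothetical column name rather than anything the package requires.

```r
# Hypothetical survey data with a raw, unscaled covariate.
surveys <- data.frame(
  survey_date = as.Date(c("2020-06-01", "2020-06-02", "2020-07-01")),
  primary_period = c(1, 1, 2),
  secondary_period = c(1, 2, 1),
  temperature = c(12.5, 15.0, 18.5)
)

# scale() centers to mean 0 and divides by the standard deviation.
# It returns a one-column matrix, so as.numeric() stores a plain vector.
surveys$temperature_scaled <- as.numeric(scale(surveys$temperature))
```

The scaled column could then be supplied as 'capture_formula = ~ temperature_scaled'.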

survival_formula

An optional formula specifying the structure of individual-level survival covariates. Any variables in this formula must be columns in the 'captures' data frame, and if there are translocations, these variables must also exist as columns in the 'translocations' data frame. The formula must start with '~' and can be provided unquoted. It is advisable to ensure that any continuous covariates provided in this formula are appropriately scaled (ideally, with mean = 0, and standard deviation = 1). Variables specified in this formula cannot be time-varying: they must be fixed for each individual over the entire study.
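
Because survival covariates must be constant within an individual, it can help to check the capture data before calling 'clean_data'. The sketch below is one way to do this in base R; the 'body_size_scaled' column and its values are hypothetical, and the check itself is not something the package is known to provide.

```r
# Hypothetical capture data with an individual-level covariate.
captures <- data.frame(
  pit_tag_id = c("A1", "A1", "B2", "B2", "C3"),
  survey_date = as.Date(c("2020-06-01", "2020-07-01",
                          "2020-06-01", "2020-07-02",
                          "2020-06-02")),
  body_size_scaled = c(0.4, 0.4, -1.1, -1.1, 0.7)
)

# Count the distinct covariate values recorded for each individual.
n_values <- tapply(
  captures$body_size_scaled,
  captures$pit_tag_id,
  function(x) length(unique(x))
)

# Individuals with more than one distinct value have a time-varying covariate.
varying_ids <- names(n_values)[n_values > 1]
```

An empty 'varying_ids' indicates that the covariate is fixed for every individual in the capture data.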

survival_fill_value

A fill value to use for individual-level covariates. This argument is only required when using the 'survival_formula' argument.

Value

A list containing the data frames resulting from the capture, translocation, and survey data, along with a list of data formatted for use in a mark-recapture model (named 'stan_d').

Examples

library(mrmr)
library(readr)

# load the example data sets bundled with the package
captures <- read_csv(
  system.file('extdata', 'capture-example.csv', package = 'mrmr')
)
translocations <- read_csv(
  system.file('extdata', 'translocation-example.csv', package = 'mrmr')
)
surveys <- read_csv(
  system.file('extdata', 'survey-example.csv', package = 'mrmr')
)

# read and clean the data using defaults
data <- clean_data(captures, surveys, translocations)


## Not run: 
# (optional) specify a formula for detection probabilities, assuming
# there is a column called "person_hours" in the surveys data frame
data <- clean_data(captures, surveys, translocations,
                   capture_formula = ~ person_hours)

## End(Not run)

SNARL1/mrmr documentation built on Nov. 23, 2023, 7:04 a.m.