This package aims at providing an efficient workflow for extracting, formatting and analyzing data from RFID readers in the context of studying animal behaviour in experimental settings. A schematic version of this vignette is available in the StalkeR cheat sheet. The package allows researchers to compute several behavioural indices for each individual.
First, you can (install and) load the StalkeR package.
library(pondr)
Before starting to use the StalkeR package functions, you have to format and organise your data in a certain way. The unit of the StalkeR package is a data frame containing reads for one experimental block. This means that in the data frame, every row is one read, containing information about the individual ID, the antenna and the exact time of the read. An experimental block represents a unit limited in space and time, with associated antennas. For instance, if you work with birds, an experimental block can be one enclosure with several animals. If you work with fishes, an experimental block might be a pond/aquarium containing a group of fish. To obtain reads from one experimental block, we suggest to use the following workflow.
First, you should build the a experimental block *reference data frame . This object contains a list of all the individuals of one experimental block. In addition, several covariates can be added to the data frame, e.g. sex, size or experimental treatment. The data frame must contain at least the id* columns with the individuals identification code/number.
# I import the example data frame data(block_ref_df) # I show the first rows of this example head(block_ref_df)
Then, depending on your experimental set-up, there are two options: (i) the data frame(s) you obtain from your RFID readers contain reads already corresponding to experimental blocks (one data frame per experimental block); (ii) the data frame(s) you obtain from your RFID readers contain reads from several experimental blocks (one data frame for several experimental block); several data frames obtained from your RFID readers constitute a single experimental block.
Eventually, you want to obtain one data frame per experimental block. Thus, if you are in the first case, you can move to the 'Format raw data' section below directly. If however you are in the second or the third case, you will need to subset of assemble the reads corresponding to your different experimental blocks first. We suggest the use of the merge() or the rbind() functions, for example.
You must be now working with a data frame containing reads from one experimental block. This data frame has to contain at least four different columns: id, date, time and antenna. The id column must contain a unique number/code per individual; the date column must contain the data at the dd-mm-yy format; the time column must contain the time in the hh:mm:ss format; the column must contain a unique antenna number/code. Be sure to name these columns as in the example below, keeping in mind that R is case sensitive (no capital letters in columns name).
# I import the example data frame data(raw_df) # I show the first rows of this example head(raw_df)
Now that block_ref_df
and raw_df
are ready, you can start to use the different package functions.
The first function (pr_clean_data) has to be used before using any of the next StalkeR functions. It uses the block_ref_df
and raw_df
objects as inputs, and outputs a cleaner version of raw_df
named block_df
. The function does the following actions:
block_ref_df
object.The output block_df
, is the working unit of this package.
{width="50%"}
Example:
block_df <- pr_clean_data(raw_df, block_ref_df) head(block_df)
Once you generated block_df
, you can run a set of functions to compute different behavioural indices. The functions are described below:
The pr_summary uses, as input, the reads from an experimental block (block_df
) and a reference list of fish present in this block (block_ref_df
). It produces:
Why do we need both block_df
and block_ref_df
? If all the individuals have been read in block_df
, then the individuals present in block_ref_df
and block_df
are the same. However, if some individuals haven't been read in block_df
(e.g. animals haven't moved, lost the tag, are dead), then there will be more individuals in the block_ref_df
.
{width="100%"}
Example:
# I run the function data_summary <- pr_summary(block_df = block_df, block_ref_df = block_ref_df) # I create the summary table object summary_table <- data_summary$df head(summary_table) # I create a heatmap showing the number of reads per individual summary_heatmap <- data_summary$heatmap summary_heatmap
The pr_sociality computes, for every individual, the proportion of time it was read in presence of at least one conspecific. To do so, the function divides the number of accompanied reads (number of reads for which the focal individual was read at the same second and at the same antenna than at least another individual) by the total number of reads (accompanied plus non-accompanied reads).
pr_sociality uses three different inputs: block_df
, block_ref_df
and cutoff
. The cutoff
argument is a value corresponding to the minimum number of total reads from which the sociality index (that is, the proportion of accompanied reads) should be computed. For example, if an individual was read only once, by chance, in presence of a conspecific, a score of 100% for this individual is probably not representative of the real proportion of time it spends in presence of conspecifics. By default, the value if set at 10 total reads. For no cutoff, one can use a value of 0.
{width="100%"}
Example:
sociality_table <- pr_sociality(block_df = block_df, block_ref_df = block_ref_df, cutoff = 15) head(sociality_table)
This function computes, for every individual, the latency to reach (i.e. be read at) certain antennas. This can for example be used in the context of an exploration test, during which individuals enter an exploration box, in which they are read by an antenna.
The function computes the difference between the initial time (time at which the antennas were introduced/started to record) and the first read at one or several antennas of interest. Individuals that were not read at these antennas receive a score corresponding to the maximal possible duration by default.
In addition to block_df
and block_ref_df
, the function needs a series of additional arguments:
antenna_nb
: If there is one antenna of interest, then antenna_nb
corresponds to its number/name. However, if there are several antennas of interest, the argument should be a vector of antenna names/numbers.
start_time
: The time from which the experimental block starts (i.e. POSIXct format, see examples).
end_time
: The time until which the experimental block lasts (i.e. POSIXct format, see examples).
keep_NA
: Logical argument. If the individuals are never read by the antenna(s), the latency might be set as the maximal possible value, i.e. end_time
- start_time
(default option), or be kept as NA.
unit
: Unit of the latency duration: 'm' as default.
{width="100%"}
Example 1, one antenna of interest:
# Antenna of interest antenna_nb_1 <- 44 # Start and end time start_time <- as.POSIXct(strptime(c("2020-11-05 12:30:00"), "%Y-%m-%d %H:%M:%OS"), "UTC") end_time <- as.POSIXct(strptime(c("2020-11-05 15:00:00"), "%Y-%m-%d %H:%M:%OS"), "UTC") # I run the function ant_44_table <- pr_latency_reach( block_df = block_df, block_ref_df = block_ref_df, antenna_nb_1, start_time = start_time, end_time = end_time, keep_NA = T, unit = 's') head(ant_44_table)
Example 2, several antennas of interest:
# Antennas of interest antenna_nb_2 <- c(41, 42, 43, 44) # I run my function mult_ant_table <- pr_latency_reach( block_df = block_df, block_ref_df = block_ref_df, antenna_nb_2, start_time = start_time, end_time = end_time, unit = 'm') head(mult_ant_table)
This function computes, for every individual, the latency to cross a sequence of antennas consecutively. For example if you want to know how long an individual took to go from pond A to pond B from the start of the experiment, you could install two antennas at each side of a connector (e.g. 41 and 42), you could then obtain the latency to be consecutively read at 41 then 42. In addition to the latency to go from pond A to pond B, you might be interested in knowing how long an individual took to go from pond B to pond C (e.g. antenna 43) - in this case, you can indicate several sequences of antennas of interest (e.g. 41-42 and 42-43).
Specifically, the function computes the difference between the initial time (time at which the antennas were introduced/started to record) and the first read at one antenna of the indicated sequence(s). More details are given below. Individuals that were not read at these antennas receive a score corresponding to the maximal possible duration by default.
In addition to block_df
and block_ref_df
, the function needs a series of additional arguments:
sequence
: Either a vector of antenna numbers/names, or a data frame containing several sequences of the same length (one vector per row), see examples.
start_time
: The time from which the experimental block starts (i.e. POSIXct format, see examples).
end_time
: The time until which the experimental block lasts (i.e. POSIXct format, see examples).
seq_position
: Which antenna, from the sequence of antennas, should be crossed by an animal to calculate the latency? By default, the first read at the second antenna is used to calculate the latency (seq_position
= 2). When seq_position
= 1, the value will correspond to the first read at the first antenna of the sequence, which might have occurred long before the animal actually crossed the sequence. For example, if the sequence is c(1, 2), the individual can have been read at antenna 1 at 12:00:00, and only cross the sequence later (cross antenna 1 at 12:45:03 and cross antenna 2 at 12:45:05). In the later case, if seq_position
= 1, the latency will be computed with 12:00:00, and if seq_position
= 2, the latency will be computed with 12:45:05.
keep_NA
: Logical argument. If the individuals are never read by the antenna(s), the latency might be set as the maximal possible value, i.e. end_time
- start_time
(default option), or be kept as NA.
unit
: Unit of the latency duration: 'm' as default.
{width="100%"}
Example 1, one sequence of interest :
# Sequence of interest sequence_1 <- c(41, 42) # Start and end time start_time <- as.POSIXct(strptime(c("2020-11-05 12:30:00"), "%Y-%m-%d %H:%M:%OS"), "UTC") end_time <- as.POSIXct(strptime(c("2020-11-05 15:00:00"), "%Y-%m-%d %H:%M:%OS"), "UTC") # I run the function pr_latency_cross(block_df, block_ref_df, sequence_1, start_time, end_time)
Example 2, several sequences of interest :
# Sequences of interest sequence_2 <- as.data.frame(rbind( c(41, 42), c(42, 43) )) # I run the function pr_latency_cross(block_df, block_ref_df, sequence_2, start_time, end_time, keep_NA = TRUE, unit = 's')
This function computes, for every individual, the euclidean distance it has travelled among two or more antennas. It can for example be useful as a proxy for activity level. The output is a data frame containing two columns: "id" and "distance", and indicates the travelled distance for all individuals.
In addition to block_df
and block_ref_df
, the function needs one additional argument: ant_coordinates
. ant_coordinates
is a data frame containing the coordinates of the antennas of interest. It contains four columns: "antenna","x","y","z" (where "x", "y", and "z" represent the coordinates in three dimensions). If there is no z axis, enter all zeros but do not omit the column. Every row corresponds to one antenna.
{width="100%"}
Example:
# I load the coordinates of four antennas data(ant_coordinates) ant_coordinates
I run the function given the coordinates indicated in ant_coordinates
.
pr_dist(block_df = block_df, block_ref_df = block_ref_df, ant_coordinates = ant_coordinates)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.