source("setup.R")

# Copy data files to website dir:
system("cp ../inst/extdata/raw/exp2/*.TXT ./Data/Raw/exp2/")
system("cp ../inst/extdata/raw/exp3/*.TXT ./Data/Raw/exp3/")

system("cp ../inst/extdata/animal_index.csv ./Data/")
system("cp ../inst/extdata/chickadees_logger_index.csv ./Data/")
system("cp ../inst/extdata/problems.csv ./Data/")

This is a quick tutorial to get you started with loading in your own data. feedr includes several wrapper functions that can be used to load and format your data.

A note about file structure

It's important to remember that when specifying file locations, you must either specify a complete file location (e.g. "/home/steffi/Desktop/data.csv") or an appropriate relative file location. A relative file location would be something like: "./Data/data.csv" which points to a file called data.csv which is in a folder called "Data". The "./" indicates that "Data" is in the current working directory. If we used "../" that would indicate that "Data" was one directory up.

Also, remember that file locations are relative to where R's working directory is, and this is not necessarily the same place as the R script with which you are working.

If you are using RStudio, it is highly recommended that you specify an RStudio project in the directory which holds your scripts. This way, anytime you open the file, the working directly is automatically set to your script directly.

This tutorial assumes that your data is stored in a folder called "Data" which is in turn stored in your R scripts folder.

load_raw()

This loads and formats a raw data file downloaded directly from RFID loggers setup in the same manner as the Thompson Rivers University loggers.

In the raw form, this data looks like:

cat(readLines("./Data/Raw/exp2/GR10DATA_2016_01_16.TXT", n=6), sep = '\n')

When we use load_raw() to import it, we get this:

r1 <- load_raw("./Data/Raw/exp2/GR10DATA_2016_01_16.TXT")
head(r1)

Note that the logger_id has been extracted from the first line of the file. This is done by matching an expected pattern against the the contents of the first line (these patterns are called 'Regular Expressions').

The default pattern matches GR or GPR followed by 1 or 2 digits. If you need to specify a different pattern you can do so. For example, if your loggers were labeled "Logger_10" or "Logger_01":

system("cp ./Data/Raw/exp2/GR10DATA_2016_01_16.TXT ./Data/Raw/exp2/Logger_10_2016_01_16.TXT")
system("sed -i '1s/GR10DATA/Logger_10/' ./Data/Raw/exp2/Logger_10_2016_01_16.TXT")
cat(readLines("./Data/Raw/exp2/Logger_10_2016_01_16.TXT", n=6), sep = '\n')
r1 <- load_raw("./Data/Raw/exp2/Logger_10_2016_01_16.TXT", logger_pattern = "Logger_[0-9]{2}")
head(r1)

Details

Alternatively, you can determine where load_raw() gets extra details from by specifying the details argument (defaults to details = 1).

system("cp ./Data/Raw/exp2/GR10DATA_2016_01_16.TXT ./Data/Raw/exp2/Logger_Data.TXT")
system("sed -i '1s/GR10DATA/Logger_21/' ./Data/Raw/exp2/Logger_Data.TXT")

For example:

1. details = 0

logger_id is in the file name, defined by the pattern logger_pattern (If logger_id is also the first line, need to skip that first line).

For example, if the file looks like:

cat(readLines("./Data/Raw/exp2/Logger_Data.TXT", n=6), sep = '\n')

Then this pattern won't work because it matches the first line, not the file name:

load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger_[0-9]{2}", details = 0, skip = 1)

But this pattern will work:

load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger", details = 0, skip = 1)
load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger", details = 0, skip = 1)[1:6,]

2. details = 1

logger_id is in the first line of the file, also defined by the pattern logger_pattern

load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger_[0-9]{2}", details = 1)
load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger_[0-9]{2}", details = 1)[1:6,]

3. details = 2

logger_id is in the first line of the file, and lat/lon information is on the second line, in the format of "latitude, longitude" both in decimal format (spacing doesn't matter, but the comma does):

Such a file would look like this:

system("sed -i '1s/Logger_21/Logger_21\\\n53.89086,-122.81933/' ./Data/Raw/exp2/Logger_Data.TXT")
cat(readLines("./Data/Raw/exp2/Logger_Data.TXT", n=6), sep = '\n')

And would be loaded by load_raw() like this:

load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger_[0-9]{2}", details = 2)
load_raw("./Data/Raw/exp2/Logger_Data.TXT", logger_pattern = "Logger_[0-9]{2}", details = 2)[1:6, ]

For more information on how to write Regular Expression patterns, see documentation for "Regular Expressions" (e.g. http://www.regular-expressions.info/tutorial.html)

load_raw_all()

The function load_raw_all() is a wrapper function which will automatically load and combine data contained in several different files in a single folder, or in a nested series of folders. Other files can be present, but all data files must be identifiable by a pattern (pattern argument) in the file name.

In this example our data files are stored in a folder called raw and there are several sets of data, each corresponding to an individual experiment which are then stored in their own folder called exp1, exp2, etc. Logger data files are identifiable by the characters 'DATA' present in the name (as in the above example), which is the default pattern:

r <- load_raw_all(r_dir = "./Data/Raw")
head(r)
summary(r)

(Note that empty files are skipped, but identified)

If your logger files don't have an identifiable label, but are the only csv files in the folders, you could use:

r <- load_raw_all(r_dir = "./Data/Raw", pattern = ".csv")

Extra details

In this example we have several different experiments, which we'll probably want to identify in our data. This is where the extra_ arguments come in.

list.files("./Data/Raw")

In our example, each experiment is stored in its own folder ('exp2' and 'exp3'). Therefore we can tell our function to identify patterns (extra_pattern) in the file names and store the values in an extra column (extra_name):

r <- load_raw_all(r_dir = "./Data/Raw", extra_pattern = "exp[2-3]{1}", extra_name = "experiment")
head(r)

"exp[1-2]{1}" matches the exact characters "exp" followed by either a 1 or a 2 of which there is exactly 1. The values matching this pattern are then stored in a new column called 'experiment'.

Because here the loggers were RFID-enabled feeders reused for different experiments, some logger have the same id, but a different lat/lon. However, logger_ids need to be unique or we will have problems later on, so let's create unique logger_id names:

r$logger_id <- paste(r$experiment, r$logger_id, sep = "-")
head(r)

Logger details

Because raw logger data doesn't include logger specific details, we should probably include some extra data for use later (visualizations, etc.):

We can do this the same way as we did above (see 'Details' under load_raw()), or by merging our raw data with a logger index file. This method works best when you have multiple details you'd like to add (more than just lat/lon information), or when your raw files don't contain lat/lon information already.

## Open logger index
l_index <- read.csv("./Data/chickadees_logger_index.csv")
head(l_index)
l_index$logger_id <- paste(l_index$experiment, l_index$logger_name, sep = "-")

## Merge logger index into RFID data, matching 'experiment' and 'logger_id'
r <- merge(r, l_index, by = c("experiment", "logger_id"))
head(r)

This data is now ready for housekeeping or go straight to transformations!

ui_import()

ui_import() is a helper function that launches a stand-alone shiny app to give you a user-interface for importing data.

library(feedrUI)

my_imported_data <- ui_import()


Back to top
Go back to home | Continue with housekeeping



animalnexus/feedr documentation built on Feb. 2, 2023, 1:12 a.m.