GENEAread-package: A package to process binary accelerometer output files.

Description

This package processes binary output files from GENEA accelerometers.

The main functions are:

read.bin
stft
epoch

read.bin processes binary accelerometer files and converts the information into R objects.

Usage

read.bin(
  binfile,
  outfile = NULL,
  start = NULL,
  end = NULL,
  Use.Timestamps = FALSE,
  verbose = TRUE,
  do.temp = TRUE,
  do.volt = TRUE,
  calibrate = TRUE,
  downsample = NULL,
  blocksize,
  virtual = FALSE,
  mmap.load = (.Machine$sizeof.pointer >= 8),
  pagerefs = TRUE,
  ...
)

Arguments

`binfile`	A filename of a file to process.
`outfile`	An optional filename specifying where to save the processed data object.
`start`	A representation of where in the file to begin processing: a page number, a proportion of the data, or a time string. See Details.
`end`	A representation of where in the file to end processing, in the same formats as `start`. See Details.
`Use.Timestamps`	To use timestamps as the start and end time values this has to be set to TRUE. (Default FALSE)
`verbose`	A boolean variable indicating whether some information should be printed during processing.
`do.temp`	A boolean variable indicating whether the temperature signal should be extracted.
`do.volt`	A boolean variable indicating whether the voltage signal should be extracted.
`calibrate`	A boolean variable indicating whether the raw accelerometer values and the light variable should be calibrated according to the calibration data in the headers.
`downsample`	The type of downsampling to apply to the data as it is loaded. Can take the following values: NULL (default): no downsampling. A single numeric: reads every downsample-th value, starting from the first. A length-two numeric vector: reads every downsample[1]-th value, starting from the downsample[2]-th. Non-integer downsampling factors, or factors that do not divide 300, are allowed, but will lead to imprecise frequency calculations, leap seconds being introduced, and potential problems with other methods. Use with care.
`blocksize`	Integer value giving maximum number of data pages to read in each pass. Defaults to 10000 for larger data files. Sufficiently small sizes will split very large data files to read chunk by chunk, reducing memory requirements for the read.bin function (without affecting the final object), but conversely possibly increasing processing time. Can be set to Inf for no splitting.
`virtual`	Logical. If set to TRUE, do not do any actual data reading. Instead, construct a VirtAccData object containing header information to allow use with get.intervals.
`mmap.load`	Default is (.Machine$sizeof.pointer >= 8). See `mmap` for more details.
`pagerefs`	Considered only when `mmap.load = TRUE`. Can take one of three forms: NULL or FALSE, in which case page references are calculated dynamically for each record. A vector giving sorted byte offsets for each record, for mmap reading of data files. TRUE (the default shown in Usage), in which case a full page reference table is computed before any processing occurs. Computing the full table takes a little time, so the first read is slightly slower. However, it is safer than dynamic computation when pages are missing or temperature variations are high. Further, once page references are calculated, future reads are much faster, so long as the previously computed references are supplied.
`...`	Other optional arguments that affect manual calibration and data processing. These are: mmap: logical; if TRUE (default on 64-bit R), use the mmap package to process the binfile. gain: a vector of 3 values for manual gain calibration of the raw (x,y,z) axes; if gain=NULL, the gain calibration values are taken from within the output file itself. offset: a vector of 3 values for manual offset calibration of the raw (x,y,z) axes; if offset=NULL, the offset calibration values are taken from within the output file itself. luxv: a value for manual lux calibration of the light meter; if luxv=NULL, the lux calibration value is taken from within the output file itself. voltv: a value for manual volts calibration of the light meter; if voltv=NULL, the volts calibration value is taken from within the output file itself. warn: if TRUE, give a warning if the input file is large, and require user confirmation.
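
The two numeric forms of `downsample` described above can be illustrated with a small base R sketch. The helper `downsample_index` is hypothetical and is not part of GENEAread; it only shows which sample positions would be kept, assuming integer factors. The divisibility check assumes that the 300 mentioned above refers to the number of samples in one data page.

```r
# Hypothetical helper (not a GENEAread function): positions of the samples
# kept out of n, for a given `downsample` setting with integer factors.
downsample_index <- function(n, downsample = NULL) {
  if (is.null(downsample)) return(seq_len(n))   # NULL: keep everything
  step  <- downsample[1]                        # read every step-th value
  first <- if (length(downsample) >= 2) downsample[2] else 1
  seq(from = first, to = n, by = step)
}

idx_single <- downsample_index(10, 3)        # positions 1, 4, 7, 10
idx_offset <- downsample_index(10, c(3, 2))  # positions 2, 5, 8

# Assumption: 300 is the number of samples per page, so a factor "divides
# 300" when the remainder is zero.
divides_page <- 300 %% c(10, 7) == 0         # TRUE for 10, FALSE for 7
```

A factor such as 7 is still accepted according to the text above, but the kept samples then no longer line up with page boundaries, which is one way to read the warning about imprecise frequency calculations.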

Details

The main tasks performed by the package are listed below. The relevant topic contains documentation and examples for each.

Extraction of file header material is accomplished by header.info.
Input and downsampling of data is accomplished by read.bin.
Selection of time intervals is accomplished via get.intervals.
Computation of epochal summaries is accomplished by epoch and other functions documented therein.
Computation of STFT analyses is accomplished by stft.

The package provides definitions and methods for the following S3 classes:

GRtime: Provides numeric storage and streamlined plotting for times.
AccData: Stores GENEA accelerometer data, allowing plotting, subsetting and other computation.
VirtAccData: A virtual AccData object, for just-in-time data access via get.intervals.
stft: Processed STFT outputs, for plotting via plot.stft.

The read.bin function reads in binary files compatible with the GENEActiv line of accelerometers, for further processing by the other functions in this package. Most of the default options are those required in the most common cases, though users are advised to consider setting start and end to smaller intervals and/or choosing some level of downsampling when working with data files longer than 24 hours.

The function reads in the desired analysis time window specified by start and end. For convenience, a variety of time window formats are accepted:

Large integers are read as page numbers in the dataset. Page numbers larger than those available in the file are constrained to what is available. Note that the first page is page 1.
Small values (between 0 and 1) are taken as proportions of the data. For example, `start = 0.5` would specify that reading should begin at the midpoint of the data.
Strings are interpreted as dates and times using parse.time. In particular, times specified as "HH:MM" or "HH:MM:SS" are taken as the earliest time interval containing these times in the file.
Strings with an integer prepended, using a space separator, are interpreted as that time after the appropriate number of midnights have passed; in other words, the appropriate time of day on the Nth full day.
Days of the week and dates in "day/month", "day/month/year", "month-day", "year-month-day" formats are also handled.

Note that the time is interpreted in the same time zone as the data recording itself.
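
The numeric cases above (page numbers and proportions) can be sketched in base R. The helper `resolve_page` is hypothetical and not part of GENEAread. How read.bin rounds a proportion, and how it treats a value of exactly 1, are not stated here, so the rounding and boundary rule below are assumptions.

```r
# Hypothetical helper (not a GENEAread function): map a numeric start/end
# value to a page number in a file that holds n_pages pages.
resolve_page <- function(x, n_pages) {
  stopifnot(is.numeric(x), length(x) == 1, x >= 0)
  if (x < 1) {
    # Assumption: values below 1 are proportions, rounded up to a whole page.
    page <- ceiling(x * n_pages)
  } else {
    # Larger values are taken as page numbers directly.
    page <- x
  }
  # Constrain to the pages that exist; the first page is page 1.
  min(max(page, 1), n_pages)
}

mid_page     <- resolve_page(0.5, 1000)   # proportion: midpoint of the file
clamped_page <- resolve_page(5000, 1000)  # beyond the file: last page
first_page   <- resolve_page(0, 1000)     # never below page 1
```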

Actual data reading proceeds by one of two methods, depending on whether mmap is TRUE or FALSE. With mmap = FALSE, data is read in line by line using readLines until blocksize is filled, and then processed. With mmap = TRUE, the mmap package is used to map the entire data file into the address space, byte locations are calculated (depending on the setting of pagerefs), blocksize chunks of data are loaded, and then processed as raw vectors.
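
Both methods work through the file in passes of at most blocksize pages. The following base R sketch shows one way pages could be grouped into such passes. The helper `page_blocks` is hypothetical and is not taken from the GENEAread source; it only illustrates why the choice of blocksize changes the number of passes but not the set of pages read.

```r
# Hypothetical helper (not a GENEAread function): group page numbers
# 1..n_pages into consecutive passes of at most `blocksize` pages.
page_blocks <- function(n_pages, blocksize) {
  pages <- seq_len(n_pages)
  if (is.infinite(blocksize)) return(list(pages))  # Inf: one single pass
  unname(split(pages, ceiling(pages / blocksize)))
}

blocks     <- page_blocks(25, 10)
pass_sizes <- lengths(blocks)   # three passes of 10, 10 and 5 pages

# The pages covered are the same whether or not the file is split.
same_pages <- identical(unlist(blocks), unlist(page_blocks(25, Inf)))
```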

There are advantages and disadvantages to both methods. The mmap method is usually much faster, especially when only the final parts of the data are loaded; readLines has to process the entire file in such a case. On the other hand, mmap requires a large amount of memory address space, and so can fail on 32-bit systems. Finally, compressed bin files can only be read with the readLines method. Generally, if mmap reading fails, the function will attempt to catch the failure and reprocess the file with the readLines method, giving a warning.

Once data is loaded, calibration is performed either using values from the binary file, or using manually supplied values (via the gain, offset, luxv and voltv arguments).

WARNING

Reading in an entire .bin file will take a long time if the file contains a large amount of data. Reading in such files without downsampling can use up all available memory. See memory.limit.

This function is specific to the header structure in GENEActiv output files. By design, it should be compatible with all firmware and software versions to date (as of the current release). If the order or names of fields are changed in future .bin files, this function may have to be updated accordingly.
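
To get a feel for the memory concern, the size of the loaded data can be estimated with simple arithmetic. This is a rough sketch only: the helpers `estimate_rows` and `estimate_mb` are hypothetical, and the 100 Hz sampling rate and 7 numeric columns are illustrative assumptions rather than values taken from this documentation. The only fixed fact used is that an R double occupies 8 bytes.

```r
# Hypothetical helpers for a back-of-envelope estimate (not GENEAread code).
estimate_rows <- function(hours, freq_hz, downsample = 1) {
  hours * 3600 * freq_hz / downsample
}
estimate_mb <- function(rows, n_cols, bytes_per_value = 8) {
  rows * n_cols * bytes_per_value / 1024^2
}

# Assumed recording: 24 hours at 100 Hz, stored as 7 double columns.
rows_full <- estimate_rows(24, 100)        # 8,640,000 rows
rows_low  <- estimate_rows(24, 100, 10)    # 864,000 rows after downsample = 10
mb_full   <- estimate_mb(rows_full, 7)     # roughly 461 MB
mb_low    <- estimate_mb(rows_low, 7)      # one tenth of mb_full
```

Under these assumptions a single day of data already occupies several hundred megabytes before any intermediate copies, which is why restricting start and end or downsampling is worth considering for longer recordings.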

Author(s)

Zhou Fang <zhou@activinsights.co.uk>

Activinsights Ltd. <joss.langford@activinsights.co.uk>

Charles Sweetland <charles@sweetland-solutions.co.uk>

Examples

# Attach the package so that read.bin can be called directly
library(GENEAread)
binfile <- system.file("binfile/TESTfile.bin", package = "GENEAread")[1]
# Read in the entire file, calibrated
procfile <- read.bin(binfile)
# print(procfile)
# procfile$data.out[1:5,]
# Uncalibrated
procfile2 <- read.bin(binfile, calibrate = FALSE)
# procfile2$data.out[1:5,]
# Read in again, reusing already computed mmap pagerefs
# procfile3 <- read.bin(binfile, pagerefs = procfile2$pagerefs)
# Downsample by a factor of 10
procfilelo <- read.bin(binfile, downsample = 10)
# print(procfilelo)
# Ratio of object sizes: downsampled versus full
object.size(procfilelo) / object.size(procfile)
# Read in a 1 minute interval
procfileshort <- read.bin(binfile, start = "16:50", end = "16:51")
# print(procfileshort)
## NOT RUN: Read, and save as an R workspace
# read.bin(binfile, outfile = "tmp.Rdata")
# print(load("tmp.Rdata"))
# print(processedfile)

GENEAread documentation built on June 22, 2024, 12:16 p.m.