read_audio: Import audio data

View source: R/read.R

read_audio    R Documentation

Description

Read audio CSV, XLS or XLSX files from Sociometrics. The original Excel sheets often have a nested column structure in which one or more badges produce data across several subcolumns. This structure is converted into a tidy data format.
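As a rough illustration of what "tidy" means here, the following base R sketch stacks a small wide table (one column per badge) into one row per timestamp and badge. The data, the one-column-per-badge layout and the Volume column name are made up for illustration; real exports are more deeply nested, and this is not the package's own conversion code.

```r
# Hypothetical wide export: one column per badge, named by badge ID.
wide <- data.frame(
  Timestamp = as.POSIXct(c("2024-01-01 10:00:00", "2024-01-01 10:00:01"),
                         tz = "UTC"),
  `1001` = c(0.012, 0.034),
  `1002` = c(0.005, 0.091),
  check.names = FALSE
)

# Every column except Timestamp is treated as a badge column.
badge_cols <- setdiff(names(wide), "Timestamp")

# Stack the badge columns: one row per (Timestamp, Badge.ID) pair.
tidy <- do.call(rbind, lapply(badge_cols, function(id) {
  data.frame(
    Timestamp = wide$Timestamp,
    Badge.ID  = id,
    Volume    = wide[[id]],
    stringsAsFactors = FALSE
  )
}))

print(tidy)
```

Each measurement ends up in its own row, which is the shape that grouping and plotting functions usually expect.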

Usage

read_audio(
  file,
  type,
  ses_info = F,
  replv = F,
  delim = "\t",
  format = NULL,
  tz = NULL,
  na.rm = F,
  cls = NULL,
  ...
)

Arguments

file

Path to source data file (xls, xlsx or csv).

type

Indicates the type of audio data to be imported (see details for available abbreviations).

ses_info

Logical. Extract and store session info from file path if available.

replv

Anonymization. The default replv=FALSE leaves the original (badge) IDs in place. Setting replv=TRUE replaces the IDs with numbers from 1 to n. Alternatively, provide a data.frame as replv: its first column holds the original values, its second column the replacement values.

delim

Single delimiter character for reading CSV data. Ignored for Excel files.

format

Optional format for parsing timestamp data. If no format is specified, two pre-established timestamp formats are tried out. See parse.smtrx for details.

tz

String. The default tz=NULL assigns the system timezone (Sys.timezone()) to the timestamps. Set tz explicitly to use a timezone other than the system timezone for the timestamp data.

na.rm

Logical. Calls na.omit on the entire data frame after conversion to tidy format.

cls

Vector of class names. Default cls=NULL uses pre-defined sociometric classes associated with the type abbreviation. However, class names can be specified explicitly as well.
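The two anonymization modes of replv can be pictured with a short base R sketch. The badge IDs and the lookup table below are invented, and numbering IDs in order of first appearance is an assumption; the documentation does not state how read_audio orders the numbers 1..n.

```r
# Hypothetical badge IDs as they might appear in a Badge.ID column.
ids <- c("1001", "1002", "1001", "1003")

# replv = TRUE style: replace IDs with 1..n.
# Assumption: numbers follow the order of first appearance.
anon_seq <- match(ids, unique(ids))

# replv = data.frame style: first column holds the original values,
# second column the replacement values.
lookup <- data.frame(
  orig = c("1001", "1002", "1003"),
  repl = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
anon_map <- lookup[[2]][match(ids, lookup[[1]])]

print(anon_seq)
print(anon_map)
```

A lookup table is the better choice when the same badge must receive the same pseudonym across several files.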

Details

Excel files are read with the readxl::read_excel function. At times the column types may need to be specified through its col_types parameter, which can be passed via the "..." argument.

Volume, pitch and frequencies are available for the front and back microphones, which can be selected with the "_F" or "_B" suffix on each abbreviation. If no suffix is included, both microphone data sheets (back + front) will be loaded. The following abbreviations are available for the type parameter:

  • "VOL[_F|_B]" - Volume. "VOL" will load both the front and back microphone sheets, "VOL_F" only front and "VOL_B" only back microphone data. Volume levels range between 0 and 1. Values < 0.01 indicate not speaking, 0.01 - 0.02 speaking quietly, 0.03 - 0.08 speaking louder, and > 0.08 speaking loudly.

  • "PITCH[_F|_B]" - Pitch. Depending on the DataLab export settings, pitch measures are aggregated over a given time interval, ranging from 1 to 60+ seconds. The typical adult male fundamental frequency ranges from 85 to 180 Hz; the typical adult female from 165 to 255 Hz.

  • "SP" - Speech profile (no front/back microphone option). For each badge, the speech profile indicates Speaking, Overlap, Listening, Silent, Total Speaking and Total Silent duration. Values depend on the chosen time interval: if the speech profile has been exported over a period of 60 seconds, Speaking (and all other) measures indicate fractions of the 60 second period, ranging from 0 to 60. If the speech profile has been exported over 1 second intervals, the columns hold values that range from 0 to 1 second. Speaking indicates the total time fraction a particular badge wearer was speaking; Overlap the time fraction a person was speaking while someone else was speaking; Listening the time fraction a particular badge was silent while someone else was speaking; Silent the time fraction during which nobody was speaking. Total Speaking is the sum of Speaking + Overlap per badge.

  • "PAR" - Speech participation (no front/back microphone option). Logical. Indicates if a particular badge was speaking or not during the given time interval.

  • "VOL_MIR[_F|_B]" - Volume mirroring. Similar indicates the similarity of volume readings between two badges within the given time interval and ranges from 0 (no match) to 1 (perfect match). Lag indicates the time lag between matches.

  • "VOL_CON[_F|_B]" - Volume consistency of each badge’s front audio amplitude, as measured in Activity (volume) (front). Consistency ranges from 0 to 1, where 1 indicates no changes in speech amplitude, and 0 indicates the maximum amount of variation in speech amplitude.

  • "FRQ[_F|_B]" - Dominant frequency. Contains three frequency bands hz_0, hz_1, hz_2 and the corresponding amplitude readings amp_0, amp_1, amp_2. Converted to tidy format, the resulting tibble contains the usual Timestamp and Badge.ID columns, followed by a Band column indicating one of the three bands (Band_0, Band_1, Band_2) and two further columns, Hz and Amplitude. There are potentially 4 frequency bands shown: hz_0 & amp_0 is the strongest peak in the cepstrum, hz_1 & amp_1 is the second strongest peak, and so on. If there are fewer than k peaks in the cepstrum, hz_k and higher-numbered values are empty. E.g., if there are only two peaks in the cepstrum, hz_2 and hz_3 are empty and not exported.

  • "TT" - Turn taking sheet. Speaking Segment: Any continuous, uninterrupted length of speech made by a single person. Turns: Turns are speaking segments that occur after, and within 10 seconds of, another speaking segment. By default, a speech segment must be made within 10 seconds after the previous one ended in order to be considered a turn. Self-turn: A speaker starts speaking, pauses for more than 0.5 seconds (but less than 10 seconds), and then resumes speaking. Successful interruptions: Person A is talking. Person B starts talking over A. If Person A talks for less than 5 out of the next 10 seconds, then Person B successfully interrupted Person A. Unsuccessful interruptions: Person A is talking. Person B starts talking over A. If Person A talks for more than 5 out of the next 10 seconds, then Person B unsuccessfully interrupted Person A. Pause: A pause is a period of time within which there is no speaking. All pauses are between 0.5 and 10 seconds.
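Two of the rules above lend themselves to a short sketch: the volume thresholds listed for "VOL" and the interruption rule listed for "TT". The helper names classify_volume and classify_interruption are hypothetical and not part of the package. The text leaves two cases open, which the sketch fills with labeled assumptions: volume values between 0.02 and 0.03 are counted as speaking quietly, and an interruption where Person A talks for exactly 5 seconds is returned as NA.

```r
# Hypothetical helper: label volume readings (range 0 to 1).
# Assumption: the unspecified gap between 0.02 and 0.03 counts as "quiet".
classify_volume <- function(v) {
  out <- rep(NA_character_, length(v))
  out[which(v < 0.01)]              <- "not speaking"
  out[which(v >= 0.01 & v < 0.03)]  <- "quiet"
  out[which(v >= 0.03 & v <= 0.08)] <- "louder"
  out[which(v > 0.08)]              <- "loud"
  out
}

# Hypothetical helper: a_talk_secs is the number of seconds Person A keeps
# talking during the 10 seconds after Person B starts talking over A.
# Assumption: exactly 5 seconds is not covered by the text, so it yields NA.
classify_interruption <- function(a_talk_secs) {
  ifelse(a_talk_secs < 5, "successful",
         ifelse(a_talk_secs > 5, "unsuccessful", NA_character_))
}

vol_labels <- classify_volume(c(0.005, 0.015, 0.05, 0.2))
int_labels <- classify_interruption(c(2, 7, 5))

print(vol_labels)
print(int_labels)
```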

Value

Tibble with data in tidy format
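A call might look like the following sketch, which uses only arguments from the Usage section. The file path is hypothetical, and the call is guarded so that the snippet does nothing when the package or the file is missing.

```r
# Hypothetical path to a Sociometrics export; replace with a real file.
path <- file.path("data", "session1_audio.xlsx")

if (requireNamespace("sociometrics", quietly = TRUE) && file.exists(path)) {
  # Front microphone volume only, badge IDs replaced by 1..n,
  # timestamps assigned to UTC, rows with missing values dropped.
  vol <- sociometrics::read_audio(
    file  = path,
    type  = "VOL_F",
    replv = TRUE,
    tz    = "UTC",
    na.rm = TRUE
  )
  print(head(vol))
} else {
  message("sociometrics package or example file not available; skipping.")
}
```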

See Also

read_body, read_interaction


jmueller17/sociometrics documentation built on March 20, 2024, 1:04 a.m.