pivot_textgrid_tiers: Pivot a textgrid into wide format, respecting nested tiers
In tjmahr/readtextgrid: Read in a 'Praat' 'TextGrid' File

pivot_textgrid_tiers

R Documentation

Pivot a textgrid into wide format, respecting nested tiers

Description

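Pivot a textgrid dataframe into wide format. When several nested tiers are given, each row describes one interval from the narrowest tier together with the intervals from the broader tiers that contain it.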

Usage

pivot_textgrid_tiers(data, tiers, join_cols = "file")

Arguments

`data`	a textgrid dataframe created with `read_textgrid()`
`tiers`	character vector of tiers to pivot into wide format. When `tiers` has more than one element, the tiers are treated as nested. For example, if `tiers` is `c("utterance", "word", "phone")`, where `"utterance"` intervals contain `"word"` intervals, which in turn contain `"phone"` intervals, the output will have one row per `"phone"` interval and include `utterance_` and `word_` columns for the utterance and word intervals that contain each phone interval. `tiers` should be ordered from broadest to narrowest (e.g., `"word"` preceding `"phone"`).
`join_cols`	character vector of the columns that uniquely identify a textgrid file. Defaults to `"file"` because that column has the same value for every tier read from the same textgrid file.

Details

When joining nested intervals, two intervals `a` and `b` are combined into the same row if they match on the values in the `join_cols` columns and if `a$xmin <= b$xmid` and `b$xmid <= a$xmax`. That is, the midpoint of `b` must fall inside the interval `a`.
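The midpoint rule can be illustrated with a small base R sketch. This is not the package's implementation: the `words` and `phones` data frames, their values, and the use of `merge()` are assumptions made only for illustration.

```r
# Hypothetical toy tiers (values invented for illustration only)
words <- data.frame(
  file = "a.TextGrid",
  text = c("hello", "world"),
  xmin = c(0.0, 0.5),
  xmax = c(0.5, 1.0),
  stringsAsFactors = FALSE
)
phones <- data.frame(
  file = "a.TextGrid",
  text = c("h", "E", "w", "3"),
  xmin = c(0.00, 0.25, 0.50, 0.75),
  xmax = c(0.25, 0.50, 0.75, 1.00),
  stringsAsFactors = FALSE
)

# Pair every word with every phone from the same file (the join_cols match)
pairs <- merge(words, phones, by = "file", suffixes = c("_words", "_phones"))

# Keep a pair only if the phone midpoint falls inside the word interval
pairs$xmid_phones <- (pairs$xmin_phones + pairs$xmax_phones) / 2
keep <- pairs$xmin_words <= pairs$xmid_phones &
  pairs$xmid_phones <= pairs$xmax_words

nested <- pairs[keep, ]
nested <- nested[order(nested$xmin_phones), c("text_words", "text_phones")]
nested
```

Because both comparisons are inclusive, a midpoint that lands exactly on a shared boundary would satisfy the rule for both neighboring intervals in this sketch.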

Value

a dataframe containing only the intervals from the tiers named in `tiers`, converted into wide format. The `text` column of each tier is pivoted into a column named after the tier. For example, the `text` column in a `words` tier is renamed to `words`. The `xmax`, `xmin`, `annotation_num`, `tier_num`, and `tier_type` columns are prefixed with the tier name. For example, the `xmax` column in a `words` tier is renamed to `words_xmax`. An additional helper column, `xmid`, is added and prefixed in the same way. See the examples below.
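The naming scheme described above can be sketched in base R as follows. The `tier` data frame and its values are hypothetical, and the renaming code only mimics the described column names; it is not how the package itself performs the pivot.

```r
# Hypothetical single-tier data frame (values invented for illustration)
tier_name <- "words"
tier <- data.frame(
  file = "a.TextGrid",
  tier_num = 1,
  tier_type = "IntervalTier",
  xmin = c(0.0, 0.5),
  xmax = c(0.5, 1.0),
  text = c("hello", "world"),
  annotation_num = 1:2,
  stringsAsFactors = FALSE
)

# Helper midpoint column, added before prefixing
tier$xmid <- (tier$xmin + tier$xmax) / 2

# Prefix the interval-level columns with the tier name
to_prefix <- c("xmin", "xmax", "xmid", "annotation_num", "tier_num", "tier_type")
names(tier)[match(to_prefix, names(tier))] <- paste0(tier_name, "_", to_prefix)

# The text column takes the tier name itself
names(tier)[names(tier) == "text"] <- tier_name

names(tier)
```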

Examples

data <- example_textgrid(3) |>
  read_textgrid()
data

# With a single tier, we get just that tier with the columns prefixed with
# the tier_name
pivot_textgrid_tiers(data, "utterance")
pivot_textgrid_tiers(data, "words")

# With multiple tiers, intervals in one tier that contain intervals in
# another tier are combined into the same row.
a <- pivot_textgrid_tiers(data, c("utterance", "words"))
cols <- c(
  "utterance", "utterance_xmin", "utterance_xmax",
  "words", "words_xmin", "words_xmax"
)
a[cols]

a <- pivot_textgrid_tiers(data, c("utterance", "words", "phones"))
cols <- c(cols, "phones", "phones_xmin", "phones_xmax")
a[cols]

tjmahr/readtextgrid documentation built on Dec. 24, 2024, 12:49 a.m.