The Aligned Corpus Toolkit
(act) is an R package designed for linguists who work with time-aligned transcription data. It offers functions to
import and export various annotation file formats ('ELAN' .eaf, 'EXMARaLDA' .exb and 'Praat' .TextGrid files),
create print transcripts in the style of conversation analysis,
search transcripts (span searches across multiple annotations, search in normalized annotations, make concordances etc.),
export and re-import search results (.csv and 'Excel' .xlsx format),
create cuts for the search results (print transcripts, audio/video cuts using 'FFmpeg'),
create video subtitles in 'SubRip' .srt format,
modify the data in a corpus (search/replace, delete, filter etc.),
interact with 'Praat' using 'Praat'-scripts (e.g. to open a search result in Praat),
interact with 'ELAN' (currently opening a search result in ELAN), and
exchange data with the 'rPraat' package.
The package is itself written in R and may be expanded by other users.
License: GPL-3
Author: Oliver Ehmer
Email: oliver.ehmer@romanistik.uni-freiburg.de
Website: http://www.oliverehmer.de
Package website: here.
CRAN site: https://cran.r-project.org/package=act
Creating the act package took a lot of time and effort. Please cite it when you publish research.
Ehmer, Oliver (2021). act: Aligned Corpus Toolkit. R package version 1.2.2. https://cran.r-project.org/package=act
To install the package in R, use the following commands.
Install from CRAN:
install.packages("act")
Install the development version from GitHub:
install.packages("remotes")
remotes::install_github("oliverehmer/act")
Load the package:
library(act)
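After installation, a quick check can confirm that the package is available and give a first overview of what it exports. The following is a minimal sketch that relies only on base R and utils functions (requireNamespace, packageVersion, getNamespaceExports, citation); it makes no assumptions about the names of act's own functions.

```r
# Check whether 'act' is installed, without attaching it to the search path.
act_installed <- requireNamespace("act", quietly = TRUE)

if (act_installed) {
  # Report the installed version.
  print(utils::packageVersion("act"))

  # List the first few exported objects to get an overview of the package.
  print(head(sort(getNamespaceExports("act")), 20))

  # Show the citation entry for the package.
  print(utils::citation("act"))
} else {
  message("Package 'act' is not installed; run install.packages(\"act\") first.")
}
```

The guard around requireNamespace keeps the snippet safe to run on a machine where the package has not been installed yet.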
An example data set including annotation and media files is available. Download a ZIP file at GitHub: https://github.com/oliverehmer/act_examplecorpus
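The example data can also be fetched from within R. The sketch below is illustrative only: the helper name fetch_example_corpus is hypothetical, and the archive URL assumes GitHub's generic "archive/HEAD.zip" download path for the repository's default branch rather than a link documented by the package author. It uses only download.file and unzip from the utils package.

```r
# Hypothetical helper (not part of 'act'): download and unpack the example corpus.
fetch_example_corpus <- function(dest_dir = file.path(tempdir(), "act_examplecorpus")) {
  # Assumption: GitHub serves a ZIP of the default branch at this path.
  zip_url  <- "https://github.com/oliverehmer/act_examplecorpus/archive/HEAD.zip"
  zip_file <- tempfile(fileext = ".zip")

  # mode = "wb" avoids corrupting the binary ZIP file on Windows.
  utils::download.file(zip_url, destfile = zip_file, mode = "wb")

  # Unpack into dest_dir, creating it if necessary.
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  extracted <- utils::unzip(zip_file, exdir = dest_dir)

  # Return the paths of the extracted files without printing them.
  invisible(extracted)
}
```

Calling fetch_example_corpus() requires an internet connection; the returned character vector holds the paths of the extracted annotation and media files.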
You might also be interested in the following R packages, which overlap functionally with the act package.
ExmaraldaR
on GitHub
FRelan
on GitHub
phonfieldwork
on CRAN and GitHub
rPraat
on CRAN and GitHub
textgRid
on CRAN