plosives: Spanish intervocalic plosives.


Description

A dataset containing measures of plosive strength for instances of intervocalic Spanish /p/, /t/, /k/, /b/, /d/ and /g/. The data come from three groups of speakers: 30 from Cuzco, Peru and 8 from Lima, Peru, recorded in read speech and informal interviews, and 18 from Valladolid, Spain, recorded in the task dialogues in the Spanish portion of the Glissando Corpus (Garrido et al. 2013). If you analyze the plosives dataset in a publication, please cite Eager (2017), listed in the References section below.

Usage

plosives

Format

A data frame with 5281 rows and 21 variables:

cdur

Total plosive duration, measured from preceding vowel intensity maximum to following vowel intensity maximum, in milliseconds. Set to 0 for elided plosives.

vdur

Duration of the period of voicelessness in the vowel-consonant-vowel sequence in milliseconds. Set to 0 for fully voiced plosives and elided plosives.

vpct

Percentage of the consonant duration which was voiceless. For non-elided plosives, vpct = vdur / cdur, and for elided plosives, vpct = 0.

intdiff

The maximum intensity in the following vowel minus the minimum intensity in the plosive, in decibels. Set to 0 for elided plosives.

intvel

The maximum rising velocity of the intensity contour between the consonant minimum intensity and following vowel maximum intensity, in decibels per millisecond. Set to 0 for elided plosives.

voicing

The underlying voicing of the plosive (Voiced or Voiceless).

place

Place of articulation (Bilabial, Dental, or Velar).

stress

Syllabic stress context (Tonic, Post-Tonic, or Unstressed).

prevowel

Preceding vowel phoneme identity (a, e, i, o, or u).

posvowel

Following vowel phoneme identity (a, e, i, o, or u).

wordpos

Position of the plosive in the word (Initial or Medial).

wordfreq

Number of times the word containing the plosive occurs in the CREA corpus (Real Academia Española).

speechrate

Local speech rate around the consonant in nuclei per second (nuclei located using De Jong and Wempe's (2009) script).

spont

Whether the speech was spontaneous (TRUE) or read (FALSE).

dialect

The city where the speaker is from (Cuzco, Lima, or Valladolid).

sex

The speaker's sex (Female or Male).

age

The speaker's age group at the time of recording (Older if over 40 years old, Younger if 40 years old or younger).

ed

The speaker's highest level of education (Secondary or University).

ling

The speaker's language background (Bilingual if they spoke both Spanish and Quechua, Monolingual if they spoke only Spanish).

speaker

Speaker identifier (s01 through s56).

item

Read speech item identifier (i01 through i54). Set to NA for spontaneous speech.
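The conventions above for elided plosives and for vpct can be illustrated with a minimal sketch. The toy data frame below is hypothetical (it is not taken from the plosives dataset) and only mimics a few of the documented columns. The sketch follows the formula as stated, vpct = vdur / cdur, so its values fall between 0 and 1; whether the shipped column is scaled differently is not checked here.

```r
# Hypothetical toy rows mimicking the documented columns; not real data.
toy <- data.frame(
  cdur  = c(120, 95, 0),         # ms; 0 marks an elided plosive
  vdur  = c(60, 0, 0),           # ms; 0 for fully voiced and elided plosives
  spont = c(FALSE, TRUE, TRUE),  # TRUE for spontaneous speech
  item  = c("i01", NA, NA),      # NA for spontaneous speech
  stringsAsFactors = FALSE
)

# vpct = vdur / cdur for non-elided plosives, and 0 for elided ones.
# The explicit ifelse avoids the NaN that 0 / 0 would otherwise produce.
elided <- toy$cdur == 0
toy$vpct <- ifelse(elided, 0, toy$vdur / toy$cdur)

# Consistency checks on the documented conventions.
stopifnot(all(toy$vdur[elided] == 0))        # elided plosives have vdur 0
stopifnot(all(is.na(toy$item[toy$spont])))   # spontaneous speech has no item
stopifnot(all(toy$vpct >= 0 & toy$vpct <= 1))

toy
```

The same checks could be run on the real data by replacing toy with plosives, assuming the nauf package is installed and the columns follow the descriptions above.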

Note

The ptk dataset in the standardize package is a subset of the plosives dataset, but with the speakers renumbered:

library(nauf)         # provides plosives
library(standardize)  # provides ptk

d <- droplevels(subset(plosives,
  dialect == "Valladolid" & voicing == "Voiceless"))

levels(d$speaker)  # s39 to s56
levels(ptk$speaker)  # s01 to s18

levels(d$speaker) <- levels(ptk$speaker)
d <- d[, colnames(ptk)]
rownames(d) <- NULL

all.equal(d, ptk)  # TRUE

References

Eager, Christopher D. (2017). Contrast preservation and constraints on individual phonetic variation. Doctoral thesis. University of Illinois at Urbana-Champaign.

Garrido, J. M., Escudero, D., Aguilar, L., Cardeñoso, V., Rodero, E., de-la-Mota, C., ... Bonafonte, A. (2013). Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan. Language Resources and Evaluation, 47(4), 945-971.

Real Academia Española. Corpus de referencia del español actual (CREA). Banco de Datos. http://www.rae.es

De Jong, N. H., & Wempe, T. (2009). Praat script to detect syllable nuclei and measure speech rate automatically. Behavior Research Methods, 41(2), 385-390.


CDEager/nauf documentation built on May 6, 2019, 9:24 a.m.