Description Usage Value Excitation Patterns Visual Analog Scale Ratings Source
Access the internal data set for productions of target sibilant fricatives by 16 adult native speakers and 69 two- to three-year-old native learners of American English, who participated in the Learning to Talk Project. The participants' productions were elicited in word-initial pre-vocalic position during a real word repetition task.
1 |
A tibble with 2475 rows (i.e., target sibilant fricative productions) and 14 columns (i.e., variables):
SessionDate: The date on which the participant completed the session, in YYYY-MM-DD format.
Session: An alphanumeric code for the session. Each adult completed two sessions; each child completed one.
Adult: A logical vector indicating whether the participant is an adult (= TRUE) or a child (= FALSE).
Participant: An alphanumeric code for the participant.
Age: An integer vector, the ages, in months, of the children and NA_integer_ for the adults.
Female: A logical vector indicating whether the participant is female (= TRUE) or male (= FALSE).
MAE: A logical vector indicating whether the audio prompts in the session were presented in Mainstream American English (= TRUE) or African American English (= FALSE).
StimulusSet: An integer vector indicating which (multi)set of sibilant-initial words was elicited during the session. Adults completed two sessions with stimulus sets 2 and 3. Children completed one session with either stimulus set 1 or 2, depending on their age: children 32 months and younger completed set 1; children 34 months and older completed set 2.
Trial: The trial number within the session when the production was elicited.
Orthography: The orthographic transcription of the word presented during the Trial, used to elicit a production of a sibilant fricative.
Target: The WorldBet transcription of the target sibilant fricative.
Transcription: A broad WorldBet transcription of the produced sibilant fricative. Note: s:S denotes a produced sibilant whose place of articulation was judged to be intermediate between s and S, but closer to S; and conversely for S:s.
Rating: A numeric vector, NA_real_ for productions by adults; the mean rating along a visual analog scale for productions by children. See "Visual Analog Scale Ratings" section below.
ExcitationPattern: A list-column of 361-component numeric vectors, each of which represents the values of an excitation pattern computed from the production. These values are associated with the vector of center frequencies, on the ERB scale: seq(from = 3, to = 39, by = 0.1). See "Excitation Patterns" section below.
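As a minimal sketch of how one entry of the ExcitationPattern list-column lines up with the center-frequency vector given above, the following base R code pairs a 361-component vector with seq(from = 3, to = 39, by = 0.1). The object toy_pattern is a made-up stand-in (a smooth bump peaking at 30 on the ERB scale), not data from this data set, and the names pattern_df and peak_erb are hypothetical.

```r
# Center frequencies on the ERB scale, as given in the column description.
erb_centers <- seq(from = 3, to = 39, by = 0.1)

# Hypothetical stand-in for one element of the ExcitationPattern list-column:
# a smooth bump centered at 30 on the ERB scale (illustration only).
toy_pattern <- exp(-((erb_centers - 30)^2) / 8)

# Pair each excitation value with its center frequency.
pattern_df <- data.frame(ERB = erb_centers, Excitation = toy_pattern)

# ERB-scale location of the largest excitation value.
peak_erb <- pattern_df$ERB[which.max(pattern_df$Excitation)]
```

With a real row of the data set, toy_pattern would be replaced by the corresponding element of the ExcitationPattern column.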
An excitation pattern is a type of psychoacoustic spectrum that represents the distribution of auditory excitation across auditory filters. To compute an excitation pattern, the auditory periphery was modeled by a bank of 361 bandpass filters. Each filter was a fourth-order, zero-phase gammatone filter. The center frequencies of the filters were uniformly spaced from 3 to 39 along the ERB scale (i.e., 0.1 inter-filter spacing). The bandwidth of each filter was proportional to its center frequency; hence, the filters were wider at high frequencies. These features model how the basilar membrane compresses the frequency scale logarithmically and is differentially tuned to different frequency components. The excitation pattern of an acoustic waveform is computed by filtering it through the gammatone bank, summing the energy at the output of each filter, and associating the output energy of each filter with its center frequency.
See data-raw/03-excitation-patterns in the source package for the code used to compute the vectors in the list-column ExcitationPattern of this data set.
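The procedure described in this section can be illustrated with a simplified sketch in base R. This is not the code in data-raw/03-excitation-patterns. It works in the frequency domain, weighting the power spectrum by an approximate fourth-order gammatone magnitude response instead of running time-domain zero-phase filters. The ERB-scale formulas (Glasberg and Moore, 1990), the conventional 1.019 bandwidth factor, and the function names erb_to_hz, erb_bandwidth, and excitation_pattern are all assumptions introduced for illustration.

```r
# Assumed ERB-number-to-Hz conversion (Glasberg & Moore, 1990).
erb_to_hz <- function(erb) (10^(erb / 21.4) - 1) / 0.00437

# Assumed equivalent rectangular bandwidth, in Hz, at a center frequency in Hz.
erb_bandwidth <- function(hz) 24.7 * (0.00437 * hz + 1)

# Hypothetical helper: approximate excitation pattern of a waveform.
excitation_pattern <- function(wave, sample_rate,
                               erb_centers = seq(from = 3, to = 39, by = 0.1)) {
  n <- length(wave)
  # One-sided power spectrum of the waveform.
  half <- seq_len(floor(n / 2) + 1)
  power <- (Mod(fft(wave))^2 / n)[half]
  freqs <- (half - 1) * sample_rate / n
  centers_hz <- erb_to_hz(erb_centers)
  vapply(centers_hz, function(fc) {
    # Approximate power response of a fourth-order gammatone filter at fc.
    b <- 1.019 * erb_bandwidth(fc)
    weight <- (1 + ((freqs - fc) / b)^2)^(-4)
    # Sum the energy passed by this filter.
    sum(weight * power)
  }, numeric(1))
}

# Usage: a 4 kHz pure tone should excite filters near its own frequency most.
sample_rate <- 44100
t <- seq(0, 0.05, by = 1 / sample_rate)
tone <- sin(2 * pi * 4000 * t)
pattern <- excitation_pattern(tone, sample_rate)
peak_erb <- seq(from = 3, to = 39, by = 0.1)[which.max(pattern)]
```

Under the assumed conversion, 4000 Hz corresponds to roughly 27 on the ERB scale, so the peak of the sketched pattern should fall near that value.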
Each production by a child was used as a stimulus in a visual-analog-scale perceptual rating task. From the recording of each of these whole-word productions, the initial CV sequence was extracted, beginning 5 ms prior to the onset of sibilant frication and ending 150 ms after the onset of voicing for the vowel. Batches of these extracted sequences were then presented to 70 listeners who were all native monolingual speakers of American English between the ages of 18 and 50 years and who reported no current or previous speech, language, or hearing disorder.
On each trial in the perceptual rating task, the listener saw a double-headed arrow anchored by the text "the 's' sound" at one end and "the 'sh' sound" at the other. The stimulus was played once, and the listener was asked to rate where the initial consonant fell on this visual analog scale by clicking at an appropriate location along the arrow. The click location in pixels was logged automatically, and the pixel locations have been normalized to fall within the [0,1] range, where 0 denotes an ideal /s/, and 1 denotes an ideal /S/.
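The normalization of click locations can be sketched as a linear rescaling between the two ends of the arrow. The source does not state the exact formula, so the rescaling below is an assumption, and normalize_click, s_end_px, and sh_end_px are hypothetical names for the function and for the pixel positions of the "s" and "sh" anchors.

```r
# Hypothetical helper: map a click location in pixels onto [0, 1],
# where 0 is the "s" end of the arrow and 1 is the "sh" end.
normalize_click <- function(click_px, s_end_px, sh_end_px) {
  rating <- (click_px - s_end_px) / (sh_end_px - s_end_px)
  # Clamp clicks that land slightly beyond either arrowhead.
  pmin(pmax(rating, 0), 1)
}

# Made-up click locations on an arrow spanning pixels 100 to 700.
clicks <- c(100, 250, 400, 700)
normalized <- normalize_click(clicks, s_end_px = 100, sh_end_px = 700)
```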
Listeners were given no explicit instructions on what criteria they should use to judge the fricative. They were encouraged to use their 'gut instinct'.
Each stimulus was rated by at least 15 listeners. The mean normalized rating (i.e., within the [0,1] range) was computed for each stimulus, and this mean rating is the number that appears in the Rating column of this data set.
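The aggregation step can be sketched in base R as follows. The per-listener ratings are not part of this data set, so the ratings data frame below is simulated, and its column names (Stimulus, Listener, Normalized) are hypothetical.

```r
# Simulated per-listener normalized ratings for two stimuli (illustration only).
set.seed(42)
ratings <- data.frame(
  Stimulus   = rep(c("stim_a", "stim_b"), each = 15),
  Listener   = rep(sprintf("L%02d", 1:15), times = 2),
  Normalized = c(runif(15, 0, 0.3), runif(15, 0.7, 1))
)

# Number of listeners who rated each stimulus (at least 15 in the real task).
n_raters <- tapply(ratings$Normalized, ratings$Stimulus, length)

# Mean normalized rating per stimulus, analogous to the Rating column.
mean_rating <- tapply(ratings$Normalized, ratings$Stimulus, mean)
```

In this simulation, stim_a is rated toward the /s/ end of the scale and stim_b toward the /S/ end, so their mean ratings fall near 0 and near 1, respectively.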
See learningtotalk.org for more information about the Learning to Talk Project.