Vowel: Vowel Recognition


Description

Speaker-independent recognition of the eleven steady-state vowels of British English, using a specified training set of LPC-derived log area ratios.

Format

A data frame with 990 observations on the following 12 variables.

y

Class label indicating vowel spoken

subset

a factor with levels test and train

x.1

a numeric vector

x.2

a numeric vector

x.3

a numeric vector

x.4

a numeric vector

x.5

a numeric vector

x.6

a numeric vector

x.7

a numeric vector

x.8

a numeric vector

x.9

a numeric vector

x.10

a numeric vector

Details

The speech signals were low-pass filtered at 4.7 kHz and then digitised to 12 bits with a 10 kHz sampling rate. Twelfth-order linear predictive analysis was carried out on six 512-sample Hamming-windowed segments from the steady part of the vowel. The reflection coefficients were used to calculate 10 log area parameters, giving a 10-dimensional input space. For a general introduction to speech processing and an explanation of this technique, see Rabiner and Schafer [RabinerSchafer78].
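As a rough illustration of the last step, the sketch below converts reflection coefficients into log area ratios. It assumes the common convention g = log((1 - k) / (1 + k)); sign conventions differ between texts, and the description above does not say which one was used. The function name log_area_ratios and the sample coefficients are hypothetical and are not taken from the dataset.

```r
# Convert reflection coefficients k (each strictly between -1 and 1)
# into log area ratios, ASSUMING the convention g = log((1 - k) / (1 + k)).
log_area_ratios <- function(k) {
  stopifnot(is.numeric(k), all(abs(k) < 1))
  log((1 - k) / (1 + k))
}

# Hypothetical reflection coefficients, chosen only for illustration.
k <- c(-0.9, -0.5, 0, 0.5, 0.9)
lar <- log_area_ratios(k)
print(round(lar, 3))
```

Under this convention the mapping is odd in k, so coefficients of equal magnitude and opposite sign give log area ratios of equal magnitude and opposite sign.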

Each speaker thus yielded six frames of speech for each of the eleven vowels, or 66 frames per speaker. This gave 528 frames from the eight speakers used to train the networks and 462 frames from the seven speakers used to test the networks.
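The frame counts can be checked with a few lines of arithmetic. The variable names below are illustrative only; the numbers are the ones stated in the text.

```r
frames_per_vowel <- 6    # frames taken from the steady part of each vowel
vowels           <- 11   # vowels spoken by each speaker
train_speakers   <- 8
test_speakers    <- 7

frames_per_speaker <- frames_per_vowel * vowels
train_frames <- train_speakers * frames_per_speaker
test_frames  <- test_speakers * frames_per_speaker
total_frames <- train_frames + test_frames

# The totals match the 528 / 462 split and the 990 observations under Format.
stopifnot(train_frames == 528, test_frames == 462, total_frames == 990)
print(c(train = train_frames, test = test_frames, total = total_frames))
```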

The eleven vowels, along with words demonstrating their sound, are: i (heed), I (hid), E (head), A (had), a: (hard), Y (hud), O (hod), C: (hoard), U (hood), u: (who'd), 3: (heard).

Source

https://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/vowel/

References

D. H. Deterding, 1989, University of Cambridge, "Speaker Normalisation for Automatic Speech Recognition", submitted for PhD.

Examples
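
A minimal sketch of how the dataset might be loaded, summarised, and split into its training and test parts. It assumes the customizedTraining package is installed and that the data load with data(Vowel); the names train and test are illustrative.

```r
# Assumes the customizedTraining package is installed; skipped otherwise.
if (requireNamespace("customizedTraining", quietly = TRUE)) {
  data(Vowel, package = "customizedTraining")

  # Per-column summary of the 12 variables.
  print(summary(Vowel))

  # Split on the `subset` factor described under Format.
  train <- Vowel[Vowel$subset == "train", ]
  test  <- Vowel[Vowel$subset == "test", ]
  print(c(train = nrow(train), test = nrow(test)))
}
```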

Example output

      y               subset         x.1              x.2        
 Length:990         test :462   Min.   :-5.211   Min.   :-1.274  
 Class :character   train:528   1st Qu.:-3.888   1st Qu.: 1.052  
 Mode  :character               Median :-3.146   Median : 1.877  
                                Mean   :-3.204   Mean   : 1.882  
                                3rd Qu.:-2.603   3rd Qu.: 2.738  
                                Max.   :-0.941   Max.   : 5.074  
      x.3                x.4               x.5               x.6         
 Min.   :-2.48700   Min.   :-1.4090   Min.   :-2.1270   Min.   :-0.8360  
 1st Qu.:-0.97575   1st Qu.:-0.0655   1st Qu.:-0.7690   1st Qu.: 0.1960  
 Median :-0.57250   Median : 0.4335   Median :-0.2990   Median : 0.5520  
 Mean   :-0.50777   Mean   : 0.5155   Mean   :-0.3057   Mean   : 0.6302  
 3rd Qu.:-0.06875   3rd Qu.: 1.0960   3rd Qu.: 0.1695   3rd Qu.: 1.0285  
 Max.   : 1.43100   Max.   : 2.3770   Max.   : 1.8310   Max.   : 2.3270  
      x.7                 x.8                x.9                x.10         
 Min.   :-1.537000   Min.   :-1.29300   Min.   :-1.61300   Min.   :-1.68000  
 1st Qu.:-0.307000   1st Qu.:-0.09575   1st Qu.:-0.70400   1st Qu.:-0.54800  
 Median : 0.022000   Median : 0.32800   Median :-0.30250   Median :-0.15650  
 Mean   :-0.004365   Mean   : 0.33655   Mean   :-0.30298   Mean   :-0.07134  
 3rd Qu.: 0.296500   3rd Qu.: 0.77000   3rd Qu.: 0.09375   3rd Qu.: 0.37100  
 Max.   : 1.403000   Max.   : 2.03900   Max.   : 1.30900   Max.   : 1.39600  

customizedTraining documentation built on May 2, 2019, 2:31 p.m.