read.sre: Function to read in an SRE data structure from file

read.sre    R Documentation

Function to read in an SRE data structure from file

Description

This function reads a submission to a NIST-type Speaker Recognition Evaluation from a file into an sre data frame. For supported evaluations (currently NIST 2008 and EVALITA 2009), it attempts to add target information to the data frame. It blesses the data.frame to be of class sre.

Usage

read.sre(file)

Arguments

file

A file containing trials in NIST submission format

Details

This function attempts to recognize the actual evaluation from the content by looking at the model IDs. It is important that these are the same as in the key sretools:::srekey. It then adds the key information, i.e., whether each trial is a target or a non-target trial, in a column target. Metadata in the trial list is read into columns, and additional metadata from the key is added. This metadata may vary per evaluation.

In a Unix environment, you can specify a command as file by ending it in a pipe symbol |, in which case the command is executed and its stdout is used as input. This allows for compressed files, concatenation of files and filtering.
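The trailing-pipe convention can be illustrated with base R alone. The sketch below does not call read.sre and is not taken from its source. It shows one way such a convention could work, using a hypothetical helper open_input and a made-up three-column file, and it assumes a Unix system with gunzip on the path.

```r
# Write a small gzip-compressed table to a temporary file (made-up contents).
tmp <- tempfile(fileext = ".txt.gz")
con <- gzfile(tmp, "w")
writeLines(c("modelA seg1 1.5", "modelB seg2 -0.3"), con)
close(con)

# Hypothetical helper (not part of sretools): if the argument ends in "|",
# run it as a command and read its stdout; otherwise open it as a plain file.
open_input <- function(path) {
  if (grepl("[|][[:space:]]*$", path)) {
    pipe(sub("[|][[:space:]]*$", "", path))
  } else {
    file(path)
  }
}

# Decompress on the fly by passing a command that ends in a pipe symbol.
cmd <- paste("gunzip -c", shQuote(tmp), "|")
x <- read.table(open_input(cmd),
                col.names = c("model", "test", "score"),
                stringsAsFactors = FALSE)
print(x)
```

With read.sre itself, the description above suggests a call of the form read.sre("gunzip -c mysubmission.txt.gz |"), where the file name is a placeholder.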

An alternative to this function is read.tnt, which just reads target and non-target scores from file.

Value

The function returns a data.frame of class sre which has a minimum of three fields: score, dec and target. Common fields include model, test, gender, mcond, tcond and adapt. NIST SRE 2008 fields are chan, mlang, tlang, mtype, ttype, mmic and tmic. EVALITA SRE 2009 has the additional field channel.

score

The score of the detection trial

dec

The decision for the trial, either TRUE or FALSE

target

The truth about whether this was a target or non-target trial

model

Model (train) ID, or target speaker in the trial

test

Test segment ID of the trial (may need additional chan)

chan

The channel (side) of the test segment, a or b

gender

Sex of the target (model) speaker, f or m

mcond

The model (train) condition of the trial

tcond

The test condition of the trial

mlang

The language of the model (train segment)

tlang

The language of the test segment

mtype

Recording type of the model segment, interview or phonecall

ttype

Recording type of the test segment

mmic

Transducer type of the model segment, mic or phn

tmic

Transducer type of the test segment

channel

The telephone channel of the test segment, PSTN P or GSM G
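To make the three mandatory fields concrete, here is a minimal sketch that builds a mock data frame by hand instead of calling read.sre. The scores and decisions are invented for illustration, and the class vector c("sre", "data.frame") is an assumption about how the object is blessed.

```r
# Mock trials with the three mandatory fields plus gender (invented values).
x <- data.frame(
  score  = c(2.1, -0.7, 0.4, -1.9),
  dec    = c(TRUE, FALSE, TRUE, FALSE),
  target = c(TRUE, FALSE, FALSE, TRUE),
  gender = c("f", "f", "m", "m"),
  stringsAsFactors = FALSE
)
class(x) <- c("sre", "data.frame")  # assumed class layout

# Split the scores into target and non-target trials.
tar <- x$score[x$target]
non <- x$score[!x$target]

# Error rates at the submitted decisions:
# a miss is a target trial rejected, a false alarm a non-target trial accepted.
p_miss <- mean(!x$dec[x$target])
p_fa   <- mean(x$dec[!x$target])
```

The same indexing works with any of the metadata fields, for instance x$score[x$target & x$gender == "f"] for female target trials.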

Note

In SRE, we have the problem that ‘train’, ‘test’ and ‘target’ all start with the same letter. In sretools we therefore refer to ‘train’ as ‘model’, so that we can use the character ‘m’. This may be a bit confusing. NIST has adopted another standard, calling the ‘test’ ‘segment’ (as if the training does not come from a speech segment).

The addition of the target and other metadata is probably done a bit clumsily. Model and test segment/channel IDs need to be in the same case as they are in the key data. This restriction may be relaxed in the future.
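The case restriction can be pictured with a small base R sketch. It uses a hypothetical two-row key and made-up model IDs, not the real srekey, and it assumes a plain exact-match lookup, which may differ from what read.sre does internally.

```r
# Hypothetical key and trial list (made-up IDs; the real key is srekey).
key    <- data.frame(model = c("m001", "m002"), target = c(TRUE, FALSE),
                     stringsAsFactors = FALSE)
trials <- data.frame(model = c("M001", "m002"), score = c(1.2, -0.4),
                     stringsAsFactors = FALSE)

# Exact matching is case sensitive: "M001" finds no key entry and yields NA.
looked_up <- key$target[match(trials$model, key$model)]

# Normalizing the case beforehand restores the match. This only helps when
# the key is known to use lower case, which is an assumption here.
normalized <- key$target[match(tolower(trials$model), key$model)]
```

Checking for NA values in the target column after reading is a quick way to spot this kind of mismatch.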

Author(s)

David A. van Leeuwen

See Also

read.tnt for an alternative, det.sre for computing performance statistics and preparing for plotting, srekey for the information known about past SREs.

Examples

## prepare a NIST submission file first, then read it
## (the file name is a placeholder)
# x <- read.sre("mysubmission.txt")

davidavdav/ROC documentation built on Sept. 8, 2023, 2:39 p.m.