LR_test: Performs LR calculation on a set of speakers

View source: R/LR_test.r

LR_test    R Documentation

Performs LR calculation on a set of speakers

Description

Iterates through all pairings in a set of test speakers and calculates the likelihood ratio (LR) for each speaker pair. Both GMM-UBM and MVKD-based approaches are available.

Usage

LR_test(
  data,
  test_speakers = NULL,
  bg_speakers = NULL,
  data_col = NULL,
  test_data = NULL,
  test_data_col = NULL,
  cross_full = TRUE,
  bg_from = c("data", "test_data", "both"),
  mode = c("gmm_ubm", "mvkd"),
  ...
)

Arguments

data

A data frame. The first column identifies speakers.

test_speakers

(optional) A vector. Specifies the set of speakers for which LR calculations should be derived. Takes all speakers from data if unspecified.

bg_speakers

(optional) A vector. Specifies the set of speakers to form the background model for LR calculations. Equal to test_speakers if unspecified.

data_col

(optional) A vector. Specifies which columns in data to take into account in LR calculation. Uses all available columns if unspecified (excludes first column, which identifies speakers).

test_data

(optional) A data frame in the same format as data. If specified, suspect/known-speaker data are taken from here, and offender/disputed-speaker data are taken from data. By default, background data follow the offender data and come from data, but this behaviour can be changed using bg_from.

test_data_col

(optional) A vector. Specifies which columns in test_data to take into account in LR calculation. Follows data_col if unspecified.

cross_full

(optional) Boolean. When test_data and test_data_col are specified, determines whether the full set of data from each speaker is used in the comparison. Default TRUE. If FALSE, suspect data come from the first half of test_data and offender (questioned) data come from the second half of data, to simulate the scenario in which test_data is not specified. This keeps the number of data points constant when results from comparisons involving non-contemporaneous recordings or mismatched conditions are to be compared with results from contemporaneous comparisons.
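The half-split described for cross_full = FALSE can be pictured with a minimal base R sketch. It is illustrative only and not the package's internal code: the toy data, the helper take_half and the reading of "half" as the first or second half of each speaker's rows in their existing order are all assumptions.

```r
# Hypothetical toy data: the first column identifies speakers, as LR_test expects.
data <- data.frame(
  speaker = rep(c("A", "B"), each = 4),
  f1 = c(500, 510, 495, 505, 620, 615, 630, 625)
)
test_data <- data.frame(
  speaker = rep(c("A", "B"), each = 4),
  f1 = c(502, 498, 507, 511, 618, 622, 627, 633)
)

# Assumption: "half" means the first or second half of each speaker's rows,
# taken in the order in which they appear in the data frame.
take_half <- function(df, which_half = c("first", "second")) {
  which_half <- match.arg(which_half)
  pieces <- lapply(split(df, df[[1]]), function(d) {
    cut <- floor(nrow(d) / 2)
    if (which_half == "first") {
      d[seq_len(cut), , drop = FALSE]
    } else {
      d[seq(cut + 1, nrow(d)), , drop = FALSE]
    }
  })
  do.call(rbind, pieces)
}

suspect  <- take_half(test_data, "first")   # suspect data: first half of test_data
offender <- take_half(data, "second")       # offender data: second half of data
```

Under these assumptions each speaker contributes two rows as suspect and two as offender, the same amount as a split of a single data frame would give.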

bg_from

(optional) "data" (default), "test_data" or "both". Determines where the background data come from. "both" pools data from the same speakers in data and test_data.
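As a rough illustration of the three bg_from options, the base R sketch below selects background rows from hypothetical data frames. The helper pick and the toy values are assumptions made for the example; the package may assemble its background set differently.

```r
# Hypothetical toy data in the documented format (first column identifies speakers).
data <- data.frame(
  speaker = rep(c("A", "B", "C"), each = 2),
  f1 = c(500, 510, 620, 615, 550, 560)
)
test_data <- data.frame(
  speaker = rep(c("A", "B", "C"), each = 2),
  f1 = c(505, 498, 618, 627, 548, 555)
)
bg_speakers <- c("B", "C")

# Hypothetical helper: keep only rows belonging to the requested speakers.
pick <- function(df, speakers) df[df[[1]] %in% speakers, , drop = FALSE]

bg_from <- "both"
background <- switch(
  bg_from,
  data      = pick(data, bg_speakers),
  test_data = pick(test_data, bg_speakers),
  both      = rbind(pick(data, bg_speakers), pick(test_data, bg_speakers))
)
```

With "both", each background speaker contributes rows from both data frames, so the background set here has twice as many rows as with either single source.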

mode

"gmm_ubm" (default) or "mvkd".

...

Additional arguments, e.g. G (default 8), the number of components used in GMM-UBM, and r (default 16), the relevance factor for the speaker-adaptation step in GMM-UBM.
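A call using the arguments above might look like the following sketch. The simulated data frame, its column names and the choice of G = 4 are made up for illustration. The call is guarded so that it only runs when fvclrr is installed, and it has not been checked against the package, so the toy data may need adjusting.

```r
set.seed(1)  # reproducible toy data

# Hypothetical data: 5 speakers, 40 tokens each, two made-up features.
speakers <- paste0("spk", 1:5)
data <- data.frame(
  speaker = rep(speakers, each = 40),
  f1 = rnorm(200, mean = rep(seq(400, 600, length.out = 5), each = 40), sd = 30),
  f2 = rnorm(200, mean = rep(seq(1400, 1800, length.out = 5), each = 40), sd = 80)
)

# Only attempt the call if the package is available.
if (requireNamespace("fvclrr", quietly = TRUE)) {
  res <- fvclrr::LR_test(
    data,
    mode = "gmm_ubm",
    G = 4,    # fewer components than the default 8, since the toy data are small
    r = 16    # relevance factor (the documented default)
  )
  print(res$cllr)
  print(res$eer)
}
```

Leaving test_speakers, bg_speakers and data_col unspecified relies on the documented defaults: all speakers in data and all columns after the first.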

Value

A named list of 3 items:

  • likelihood_ratio_matrix: A data frame. Rows and columns are named after the speaker identifiers. Each row represents a speaker as suspect and each column a speaker as offender; each cell contains a single LR score on the natural-log scale.

  • cllr: Numeric. Reports the log-likelihood-ratio cost (Cllr).

  • eer: Numeric. Reports the equal error rate (between 0 and 1).
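To make the two summary measures concrete, the sketch below computes them from a small, made-up matrix of natural-log LRs using the standard definition of Cllr and a crude threshold sweep for the EER. It assumes that diagonal cells are same-speaker comparisons and off-diagonal cells are different-speaker comparisons. The function names are hypothetical, and the package's own computation may differ in detail.

```r
# Hypothetical natural-log LRs (rows: suspects, columns: offenders).
llr <- matrix(
  c( 2.0, -1.5, -3.0,
    -0.5,  1.0, -2.0,
    -2.5, -1.0,  3.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

# Assumption: diagonal = same-speaker pairs, off-diagonal = different-speaker pairs.
same      <- diag(llr)
different <- llr[row(llr) != col(llr)]

# Standard log-likelihood-ratio cost, written for natural-log LRs.
cllr_from_llr <- function(same, different) {
  0.5 * (mean(log2(1 + exp(-same))) + mean(log2(1 + exp(different))))
}

# Crude EER estimate: sweep thresholds over the observed scores and take the
# point where the false rejection and false acceptance rates are closest.
eer_from_llr <- function(same, different) {
  thresholds <- sort(c(same, different))
  frr <- sapply(thresholds, function(t) mean(same < t))        # same-speaker pairs rejected
  far <- sapply(thresholds, function(t) mean(different >= t))  # different-speaker pairs accepted
  i <- which.min(abs(frr - far))
  (frr[i] + far[i]) / 2
}

cllr <- cllr_from_llr(same, different)
eer  <- eer_from_llr(same, different)
```

In this toy matrix every same-speaker score exceeds every different-speaker score, so the EER is 0. A system that always returned a log LR of 0 would give a Cllr of exactly 1.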


justinjhlo/fvclrr documentation built on June 27, 2022, 11:19 a.m.