hybridIndex: Compute hybrid index from genotypes, files, or numeric values

hybridIndex    R Documentation

Description

Provides a unified way to obtain the hybrid index regardless of input format. The function accepts:

  • a genotype matrix or data frame (ideally polarised);

  • a 4-column numeric matrix in the I4 format (summaries of genotype counts);

  • a file containing hybrid index values produced by diem;

  • one or more genotype files in diem format together with ploidy information;

  • or a numeric vector of hybrid index values.

Usage

hybridIndex(
  x,
  ChosenInds = "all",
  rescale = FALSE,
  ploidy = NULL,
  ChosenSites = "all",
  changePolarity = NULL
)

Arguments

x

Either a genotype matrix or data frame, a 4-column numeric matrix in the I4 format, a path to a text file containing hybrid indices, a character vector of one or more genotype files in diem format, or a numeric vector of hybrid index values.

ChosenInds

A numeric or logical vector of indices of individuals to be included in the analysis. Defaults to "all", which includes every individual.

rescale

Logical, whether to linearly rescale the resulting hybrid indices to the interval 0–1. Defaults to FALSE.

ploidy

A logical or a list with one element per file in x. Each element of the list is a numeric vector giving the ploidy of every individual in the corresponding file. Defaults to NULL.

ChosenSites

A logical vector indicating which sites are to be included in the analysis. Defaults to "all", which includes every site.

changePolarity

A logical vector, or a list of logical vectors, with length equal to the number of markers. Defaults to NULL.

Details

The function returns a numeric vector of hybrid indices and can optionally subset individuals or rescale the values to the interval 0–1.

Input type is detected automatically:

  • Hybrid-index file - the last column is extracted. The file may optionally contain the header "HybridIndex". No filtering is applied unless ChosenInds is specified.

  • Numeric vector - values are returned unchanged (except optional subsetting and rescaling).

  • I4 matrix - a 4-column numeric matrix where each row contains genotype summary counts. Each row is processed directly by pHetErrOnStateCount(row).

  • Genotype matrix - typically polarised genotypes from importPolarized. Each row is converted to state counts via sStateCount() and then passed to pHetErrOnStateCount().

  • Ploidy-aware multi-file input - if x is a character vector of files and ploidy and changePolarity are supplied, ploidy-aware hybrid indices are calculated for an optional subset of individuals (ChosenInds) and sites (ChosenSites).
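The I4 and genotype-matrix routes both reduce each individual to four state counts before the hybrid index is computed. The following is a minimal base R sketch of that final step, not the package's implementation. hi_from_counts is a hypothetical helper, and the sketch assumes the four counts are ordered as missing, homozygous 0, heterozygous 1, homozygous 2, with the hybrid index taken as the share of state-2 alleles among called genotypes.

```r
# Hypothetical helper: hybrid index from one row of an I4-style matrix.
# Assumed column order: missing, state 0, state 1 (heterozygous), state 2.
hi_from_counts <- function(counts) {
  stopifnot(is.numeric(counts), length(counts) == 4)
  called <- sum(counts[2:4])             # genotypes that were actually called
  if (called == 0) return(NA_real_)      # no usable genotype information
  (counts[3] + 2 * counts[4]) / (2 * called)
}

# Toy I4-style matrix: one row per individual (values are made up).
i4 <- rbind(
  c(0, 10, 0, 0),   # only state 0 genotypes
  c(1,  3, 4, 2),   # mixed genotypes, one missing
  c(9,  0, 0, 0)    # everything missing
)

# The first individual scores 0; the last has no calls and yields NA.
h <- apply(i4, 1, hi_from_counts)
print(h)
```

In this sketch an individual with no called genotypes yields NA rather than a number, so the caller decides how to treat it.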

If rescale = TRUE, the hybrid index is mapped to [0, 1]. If all values are equal or non-finite, the original scale is preserved and a warning is issued.
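The rescaling rule can be sketched in base R as follows. This is an illustration only: rescale_unit is a hypothetical helper, and it assumes that "linear rescaling" means min-max scaling over the finite values, which the documentation does not state explicitly.

```r
# Hypothetical helper: min-max rescaling of hybrid indices to [0, 1].
# Falls back to the input, with a warning, when rescaling is undefined.
rescale_unit <- function(h) {
  ok <- is.finite(h)
  if (!any(ok) || min(h[ok]) == max(h[ok])) {
    warning("Values are all equal or non-finite; returning them unchanged.")
    return(h)
  }
  lo <- min(h[ok])
  hi <- max(h[ok])
  (h - lo) / (hi - lo)
}

# Smallest value maps to 0, largest to 1, the midpoint to 0.5.
scaled <- rescale_unit(c(0.3, 0.5, 0.7))
print(scaled)

# Constant input cannot be rescaled, so it comes back unchanged.
flat <- suppressWarnings(rescale_unit(c(0.4, 0.4, 0.4)))
print(flat)
```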

Missing values are replaced with 0.5, reflecting the default hybrid index for samples with no usable genotype information.
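As a small illustration of this rule, missing entries can be filled in a single step. fill_missing is a hypothetical base R helper, not a diemr function, and the input values are made up.

```r
# Hypothetical helper: replace missing hybrid indices with the neutral 0.5.
fill_missing <- function(h, default = 0.5) {
  h[is.na(h)] <- default   # is.na() is also TRUE for NaN
  h
}

filled <- fill_missing(c(0.2, NA, 0.9))
print(filled)
```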

Value

A numeric vector of hybrid index values. Names are not preserved.

See Also

pHetErrOnStateCount, sStateCount, importPolarized

Examples

hybridIndex(c(0.3, 0.5, 0.7))
hybridIndex(c(0.3, 0.5, 0.7), rescale = TRUE)

hybridIndex(1:10, ChosenInds = 1:5, rescale = TRUE)

filepaths <- c(
  system.file("extdata", "data7x3.txt", package = "diemr"),
  system.file("extdata", "data7x10.txt", package = "diemr")
)

ploidies <- list(
  rep(2, 7),
  c(2, 1, 2, 2, 2, 1, 2)
)

hybridIndex(x = filepaths, ploidy = ploidies, 
  changePolarity = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, 
                     TRUE, TRUE, FALSE, TRUE, TRUE))


diemr documentation built on Dec. 11, 2025, 5:07 p.m.