hybridIndex: Compute hybrid index from genotypes, files, or numeric values
In diemr: Genome Polarization via Diagnostic Index Expectation Maximization

hybridIndex

R Documentation

Compute hybrid index from genotypes, files, or numeric values

Description

Provides a unified way to obtain the hybrid index regardless of input format. The function accepts:

a genotype matrix or data frame (ideally polarised);
a 4-column numeric matrix in the I4 format (summaries of genotype counts);
a file containing hybrid index values produced by diem;
one or more genotype files in diem format together with ploidy information;
or a numeric vector of hybrid index values.

Usage

hybridIndex(
  x,
  ChosenInds = "all",
  rescale = FALSE,
  ploidy = NULL,
  ChosenSites = "all",
  changePolarity = NULL
)

Arguments

`x`	Either a genotype matrix/data.frame, a path to a text file containing hybrid indices, one or more genotype files in diem format, or a numeric vector of hybrid index values.
`ChosenInds`	A numeric or logical vector of indices of individuals to be included in the analysis. Defaults to `"all"`, which keeps every individual.
`rescale`	Logical, whether to linearly rescale the resulting hybrid indices to the interval 0–1. Defaults to `FALSE`.
`ploidy`	A logical or a list of length equal to the number of genotype files given in `x`. Each element of the list contains a numeric vector with ploidy numbers for all individuals in the corresponding file. Defaults to `NULL`.
`ChosenSites`	A logical vector indicating which sites are to be included in the analysis. Defaults to `"all"`, which keeps every site.
`changePolarity`	A logical vector or a list of logical vectors with length equal to the number of markers.

Details

The function returns a numeric vector of hybrid indices and can optionally subset individuals or rescale the values to the interval 0–1.

Input type is detected automatically:

Hybrid-index file: the last column is extracted. The file may optionally contain the header "HybridIndex". No filtering is applied unless `ChosenInds` is specified.
Numeric vector: values are returned unchanged (except optional subsetting and rescaling).
I4 matrix: a 4-column numeric matrix where each row contains genotype summary counts. Each row is processed directly by `pHetErrOnStateCount(row)`.
Genotype matrix: typically polarised genotypes from `importPolarized`. Each row is converted to state counts via `sStateCount()` and then passed to `pHetErrOnStateCount()`.
Ploidy-aware multi-file input: if `x` is a character vector of files and `ploidy` and `changePolarity` are supplied, ploidy-aware hybrid indices are calculated for an optional subset of individuals (`ChosenInds`) and sites (`ChosenSites`).
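
To illustrate the I4 case, the following is a minimal base-R sketch of how a hybrid index could be derived from one row of genotype counts. It is not the package's `pHetErrOnStateCount()` code. It assumes that the four I4 columns hold the counts of unknown genotypes and of genotypes 0, 1 and 2, in that order, and that the hybrid index is the proportion of state-2 alleles among called genotypes. The helper name `hi_from_i4_row` is hypothetical.

```r
# Hypothetical helper, not part of diemr: hybrid index from one I4 row.
# Assumed column order: unknown, genotype 0, genotype 1, genotype 2.
hi_from_i4_row <- function(counts) {
  called <- sum(counts[2:4])           # genotypes with a usable call
  if (called == 0) return(NA_real_)    # no information for this individual
  # heterozygotes carry one state-2 allele, 2-homozygotes carry two
  (counts[3] + 2 * counts[4]) / (2 * called)
}

# Three made-up individuals, one per row
i4 <- matrix(
  c(0, 10, 0,  0,    # only genotype 0
    2,  3, 4,  3,    # mixed genotypes, two unknown
    0,  0, 0, 10),   # only genotype 2
  ncol = 4, byrow = TRUE
)

hi <- apply(i4, 1, hi_from_i4_row)
print(hi)
```

Under these assumptions the unknown-genotype column does not enter the estimate; it only reduces the number of called genotypes available for an individual.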

If `rescale = TRUE`, the hybrid index is mapped to [0, 1]. If all values are equal or non-finite, the original scale is preserved and a warning is issued.
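
The rescaling behaviour described above can be sketched in base R as follows. This is an illustration of linear min-max rescaling with the stated fallback, not the function's actual source. The helper name `rescale01` and the warning text are assumptions made for this example.

```r
# Hypothetical helper, not part of diemr: linear rescaling to [0, 1].
rescale01 <- function(v) {
  ok <- is.finite(v)
  # Fallback: nothing finite to rescale, or all finite values identical
  if (!any(ok) || min(v[ok]) == max(v[ok])) {
    warning("values are equal or non-finite; returning the original scale")
    return(v)
  }
  lo <- min(v[ok])
  hi <- max(v[ok])
  (v - lo) / (hi - lo)   # smallest value maps to 0, largest to 1
}

rescaled <- rescale01(c(0.3, 0.5, 0.7))
print(rescaled)   # 0, 0.5, 1 up to floating-point rounding

# Constant input triggers the fallback and is returned unchanged
constant <- suppressWarnings(rescale01(c(0.4, 0.4)))
print(constant)
```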

Missing values are replaced with 0.5, reflecting the default hybrid index for samples with no usable genotype information.

Value

A numeric vector of hybrid index values. Names are not preserved.

Examples

hybridIndex(c(0.3, 0.5, 0.7))
hybridIndex(c(0.3, 0.5, 0.7), rescale = TRUE)

hybridIndex(1:10, ChosenInds = 1:5, rescale = TRUE)

filepaths <- c(
  system.file("extdata", "data7x3.txt", package = "diemr"),
  system.file("extdata", "data7x10.txt", package = "diemr")
)

ploidies <- list(
  rep(2, 7),
  c(2, 1, 2, 2, 2, 1, 2)
)

hybridIndex(x = filepaths, ploidy = ploidies, 
  changePolarity = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, 
                     TRUE, TRUE, FALSE, TRUE, TRUE))

diemr documentation built on Dec. 11, 2025, 5:07 p.m.