microhaplotype_geno_err_matrix: create matrix C for probability of observed genotypes from...
In eriqande/CKMRsim: Inference of Pairwise Relationships Using Likelihood Ratios

microhaplotype_geno_err_matrix

R Documentation

create matrix C for probability of observed genotypes from microhaplotype data

Description

This function is intended for genotypes whose alleles are multi-SNP haplotypes (microhaplotypes) obtained from next-generation sequencing data. All the SNPs occur on a single read, so their phase is known. The function allows for a SNP-specific sequencing error rate. The haplotypes must be named as strings of A, C, G, or T (although they could be strings of any characters, because the function does not check this). For now, if a SNP is multiallelic, genotyping errors to any of the alternate alleles are assumed to be equally likely. Genotyping errors are also currently assumed to be equally likely in either direction at a SNP.
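
The per-SNP error assumptions described above can be illustrated with a minimal sketch. The helper `hap_misread_prob()` and its `n_alt` argument are hypothetical and not part of CKMRsim; the sketch further assumes that errors at different SNPs are independent, which this page does not state. It shows only the idea of a symmetric, SNP-specific error rate, not how the package builds matrix C.

```r
# Hypothetical helper (not a CKMRsim function): probability that the true
# haplotype `true_hap` is read as `obs_hap`, ASSUMING independent per-SNP errors.
hap_misread_prob <- function(true_hap, obs_hap, snp_err_rates, n_alt = 1) {
  true_snps <- strsplit(true_hap, "")[[1]]
  obs_snps  <- strsplit(obs_hap, "")[[1]]
  stopifnot(length(true_snps) == length(obs_snps))

  # recycle the error rates along the SNPs of the haplotype
  err <- rep_len(snp_err_rates, length(true_snps))

  # a matching SNP is read correctly with probability 1 - e; a mismatch to one
  # specific alternate allele has probability e / n_alt (errors split equally
  # among the n_alt alternate alleles, and symmetric in direction)
  per_snp <- ifelse(true_snps == obs_snps, 1 - err, err / n_alt)
  prod(per_snp)
}

p_same <- hap_misread_prob("GACC", "GACC", 0.005)  # no error at any of 4 SNPs
p_diff <- hap_misread_prob("GACC", "GTCC", 0.005)  # one error, at the second SNP
```

Under these assumptions `p_same` equals `0.995^4` and `p_diff` equals `0.995^3 * 0.005`.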

Usage

microhaplotype_geno_err_matrix(
  haps,
  snp_err_rates = 0.005,
  dropout_rates = 0.005,
  scale_by_num_snps = FALSE
)

Arguments

`haps`	character vector of strings that denote the haplotypes at the locus. For example "CCAG", "CTAG", "GCAG", etc. Note that these should be in the same order as they are given in the allele frequency definitions (so that the ordering of genotypes made from them will be correct). Each element of haps must be a string of the same number of characters. haps cannot be a factor.
`snp_err_rates`	Vector of rates at which sequencing errors are expected at each of the SNPs that are in the haplotype. This recycles if its length is less than the number of SNPs in the haplotypes.
`dropout_rates`	Haplotype-specific rates of allelic dropout. Recycles if need be.
`scale_by_num_snps`	Logical. If TRUE, the error rate is divided by the number of SNPs in each microhaplotype.
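
The argument rules above (character input, equal-length haplotypes, recycling of rates, optional scaling) can be sketched in base R as follows. The function `prep_hap_inputs()` is a hypothetical illustration, not part of CKMRsim, and the package's internal handling may differ; in particular, applying the scaling after recycling is an assumption.

```r
# Hypothetical pre-processing sketch (not a CKMRsim function).
prep_hap_inputs <- function(haps, snp_err_rates = 0.005, dropout_rates = 0.005,
                            scale_by_num_snps = FALSE) {
  # haps must be a character vector, not a factor
  stopifnot(is.character(haps), !is.factor(haps))

  # every haplotype must have the same number of characters (SNPs)
  n_snps <- unique(nchar(haps))
  if (length(n_snps) != 1) {
    stop("all haplotypes must have the same number of characters")
  }

  # recycle SNP error rates to one value per SNP
  snp_err <- rep_len(snp_err_rates, n_snps)
  # ASSUMPTION: scaling is applied to the recycled per-SNP rates
  if (scale_by_num_snps) snp_err <- snp_err / n_snps

  # recycle dropout rates to one value per haplotype
  dropout <- rep_len(dropout_rates, length(haps))
  names(dropout) <- haps

  list(snp_err = snp_err, dropout = dropout)
}

inp <- prep_hap_inputs(c("AACC", "GACC", "GATA"),
                       snp_err_rates = c(0.01, 0.02),
                       scale_by_num_snps = TRUE)
inp$snp_err  # 0.0025 0.0050 0.0025 0.0050
```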

Examples

# five haplotypes in alphabetical order
haps <- c("AACC", "GACC", "GATA", "GTCC", "GTTC")

# make the matrix C
C_mat <- microhaplotype_geno_err_matrix(haps)

# look at the first part of it
C_mat[1:5, 1:5]
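
The order of `haps` matters because genotypes are formed from pairs of haplotypes. The base R sketch below shows one possible enumeration of unordered genotypes from the five haplotypes above. The ordering and the `/` separator are assumptions made for illustration; the ordering that CKMRsim actually uses is not stated on this page.

```r
haps <- c("AACC", "GACC", "GATA", "GTCC", "GTTC")
n <- length(haps)

# all index pairs (a, b) with a <= b, i.e. unordered genotypes
pairs <- expand.grid(a = seq_len(n), b = seq_len(n))
pairs <- pairs[pairs$a <= pairs$b, ]
pairs <- pairs[order(pairs$a, pairs$b), ]

# ASSUMPTION: label each genotype as "hap1/hap2" for display only
genos <- paste(haps[pairs$a], haps[pairs$b], sep = "/")

length(genos)   # n * (n + 1) / 2 = 15 unordered genotypes
head(genos, 3)  # "AACC/AACC" "AACC/GACC" "AACC/GATA"
```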

eriqande/CKMRsim documentation built on June 12, 2025, 1:15 p.m.