haploSep: Separates haplotypes

View source: R/haploSep.R

haploSep R Documentation

Description

Implements an iterative minimization algorithm that jointly estimates the structure of the essential haplotypes and their relative proportions from the allele frequency matrix of a mixture.

Usage

haploSep(
  data,
  nHaplo,
  stabEval = TRUE,
  bias = TRUE,
  weight = NULL,
  nBoot = 15,
  fBoot = 0.95
)

Arguments

data

Numeric observation matrix with nrow(data) being the number of SNP locations and ncol(data) the number of samples (e.g. time points).

nHaplo

Integer giving the number of essential haplotypes whose structure and frequency are to be estimated from the mixture. When missing, it is estimated via the SVD criterion.
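The exact SVD criterion is not described on this page. As a rough illustration only of why singular values carry information about the number of haplotypes, the sketch below builds a noiseless synthetic mixture from three hypothetical haplotypes and counts the singular values that are not numerically zero. The matrices S_demo and W_demo, the tolerance, and the counting rule are assumptions made for this example and are not the package's method.

```r
# Hypothetical structure matrix: 6 SNPs x 3 haplotypes, entries 0 or 1
S_demo <- rbind(c(1, 0, 0),
                c(0, 1, 0),
                c(0, 0, 1),
                c(1, 1, 0),
                c(0, 1, 1),
                c(1, 0, 1))

# Hypothetical frequency matrix: 3 haplotypes x 4 samples, columns sum to 1
W_demo <- cbind(c(0.6, 0.3, 0.1),
                c(0.4, 0.4, 0.2),
                c(0.2, 0.3, 0.5),
                c(0.1, 0.2, 0.7))

# Noiseless allele frequency matrix of the mixture (SNPs x samples)
Y_demo <- S_demo %*% W_demo

# Singular values, returned by svd() in decreasing order
d <- svd(Y_demo)$d

# Count singular values that are not numerically zero (assumed tolerance)
n_est <- sum(d > 1e-8 * d[1])
n_est
```

With real, noisy data the small singular values are not exactly zero, so a fixed tolerance like this one would not be enough; haploSelect, used in the Examples, is the package's tool for choosing the number of haplotypes.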

stabEval

logical. If TRUE (default), the stability analysis of the reconstruction is carried out; otherwise only the haplotype structure and frequency are computed.

bias

logical. Indicates whether a bias term (constant over SNP locations but different across samples) should be added. By default, it is TRUE.

weight

matrix with ncol(weight) = ncol(data) and nrow(weight) = nrow(data), giving the weights of the observations in the least squares optimization step. By default, equal weights are applied.
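A weight matrix has to match data in both dimensions. The minimal sketch below builds one for a small placeholder matrix; obs, its values, and the down-weighting factor are made up for illustration, and the commented call only shows where the matrix would be passed.

```r
# Placeholder observation matrix: 5 SNPs x 3 samples (values are arbitrary)
obs <- matrix(seq(0.1, 0.8, length.out = 15), nrow = 5, ncol = 3)

# Start from equal weights, which is the documented default behaviour
w <- matrix(1, nrow = nrow(obs), ncol = ncol(obs))

# Illustrative choice: down-weight every observation of the third sample
w[, 3] <- 0.5

# The weight matrix must have the same dimensions as the data
stopifnot(identical(dim(w), dim(obs)))

# With the package loaded, the matrix would be passed as:
# haploSep(data = obs, nHaplo = 2, weight = w)
```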

nBoot

integer. The number of bootstrap repetitions used for the stability score. The default is 15. Only needed when stabEval = TRUE.

fBoot

numeric. A value between 0 and 1 giving the relative subsample size in the bootstrap runs. The default is 0.95. Only needed when stabEval = TRUE.

Value

An object of class haplo, which is a list with entries haploFrq and haploStr.

  • haploFrq is a matrix with nrow(haploFrq) = nHaplo and ncol(haploFrq) = ncol(data) which gives the estimated frequency of the estimated essential haplotypes.

  • haploStr is a matrix with nrow(haploStr) = nrow(data) and ncol(haploStr) = nHaplo and entries being either 0 or 1. It gives the estimated haplotype structure of the essential haplotypes.

The returned list has an attribute "nHaplo", which is the number of essential haplotypes.
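The dimensions listed above imply that the product haploStr %*% haploFrq has the same shape as data. The sketch below uses hand-made stand-ins for the two matrices (not output of haploSep) to show the shapes, and assumes, as the Description suggests, that the mixture is modelled as this product.

```r
# Stand-in for haploStr: 4 SNPs x 2 haplotypes, entries 0 or 1
haploStr_demo <- cbind(c(1, 0, 1, 0),
                       c(0, 1, 1, 0))

# Stand-in for haploFrq: 2 haplotypes x 3 samples, columns sum to 1
haploFrq_demo <- rbind(c(0.7, 0.5, 0.2),
                       c(0.3, 0.5, 0.8))

# Implied allele frequencies of the mixture: 4 SNPs x 3 samples
fitted_demo <- haploStr_demo %*% haploFrq_demo
dim(fitted_demo)
fitted_demo
```

On a real result stored in a variable such as res (a hypothetical name), the same entries would be read with res$haploFrq and res$haploStr, and the number of haplotypes with attr(res, "nHaplo").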

If stabEval = TRUE the returned list has three more attributes "R2", "stabIntFrq" and "stabScoreStr".

  • The attribute "R2" is an indicator of how well the model fits, in a similar spirit to the R^2 for linear models, see lm and summary.lm.

  • The attribute "stabIntFrq" provides a confidence envelope for the estimated haplotype frequency. This is a data frame containing "lowerBnd" and "upperBnd" for each haplotype, which are the 0.025 and 0.975 quantiles of the bootstrap samples, respectively.

  • The attribute "stabScoreStr" is a numeric vector of length nHaplo with values between 0 and 1.
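To make two of these summaries concrete, the sketch below computes a 0.025/0.975 quantile envelope from simulated bootstrap replicates and an R^2-style fit measure from simulated residuals. Everything here is synthetic: boot_demo, y_obs and y_fit are invented inputs, and the formula 1 - RSS/TSS is the usual linear-model definition, assumed here only as an analogy to the package's "R2".

```r
set.seed(1)

# Simulated bootstrap replicates of one haplotype's frequency:
# 15 replicates (rows) x 4 samples (columns), clipped to [0, 1]
centre <- c(0.6, 0.4, 0.2, 0.1)
boot_demo <- matrix(rep(centre, each = 15) + rnorm(15 * 4, sd = 0.02),
                    nrow = 15, ncol = 4)
boot_demo <- pmin(pmax(boot_demo, 0), 1)

# Per-sample 0.025 and 0.975 quantiles across the replicates
q <- apply(boot_demo, 2, quantile, probs = c(0.025, 0.975))
envelope <- data.frame(lowerBnd = q[1, ], upperBnd = q[2, ])
envelope

# R^2-style measure: 1 - residual sum of squares / total sum of squares
y_fit <- c(0.7, 0.5, 0.2, 0.3, 0.5, 0.8)
y_obs <- y_fit + c(0.01, -0.02, 0.01, 0.00, 0.02, -0.01)
r2_demo <- 1 - sum((y_obs - y_fit)^2) / sum((y_obs - mean(y_obs))^2)
r2_demo
```

On a real result the corresponding values would be read with attr(), for example attr(res, "stabIntFrq") for a result stored in a hypothetical variable res.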

Examples

 library(haploSep)

 # Reconstruct 5 haplotypes
 data(ExampleDataset)
 haploSep(data = Y, nHaplo = 5, stabEval = TRUE, bias = TRUE)

 # Choose the number of haplotypes to be reconstructed with haploSelect
 data(ExampleDataset)
 m <- haploSelect(data = Y, bias = TRUE)
 haploSep(data = Y, nHaplo = m, stabEval = TRUE, bias = TRUE)

MartaPelizzola/haploSep documentation built on May 26, 2023, 11:36 a.m.