getIBDsegments: XIBDs IBD Segment Detection

Description

Detects genomic regions shared identical by descent (IBD) between pairs of individuals.

Usage

getIBDsegments(ped.genotypes, parameters, model = NULL, chromosomes = NULL,
  number.cores = 1, minimum.snps = 20, minimum.length.bp = 50000,
  error = 0.001, posterior = FALSE)

Arguments

ped.genotypes

a named list containing pedigree, genotypes and model. See the Value description in getGenotypes for more details. Note that the family IDs and individual IDs in pedigree must match the family IDs and individual IDs in the header of genotypes.

parameters

a data frame containing meioses and IBD probability estimates for all pairwise combinations of samples. See Value description in getIBDparameters for more details.

model

an integer of either 1 or 2 denoting which of the two models should be run.

  1. model=1 is based on the HMM implemented in PLINK (Purcell et al., 2007) which assumes the SNPs are in linkage equilibrium (LE). This often requires thinning of markers prior to use.

  2. model=2 is based on the HMM implemented in RELATE (Albrechtsen et al., 2009), which allows SNPs to be in linkage disequilibrium (LD). It accounts for LD implicitly through conditional emission probabilities, where the genotype probability at the current SNP is conditioned on the genotype of a single previous SNP.

The default is model=NULL and the model listed in ped.genotypes is used. NOTE: model=2 can only be used if model=2 is in ped.genotypes.
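
The rule for resolving the model can be sketched as below. This is only an illustration of the behaviour described above, not the package's own code: resolve_model is a hypothetical helper, and it assumes the model is stored in ped.genotypes under the element name "model".

```r
# Hypothetical helper (not part of XIBD): decide which HMM to run.
resolve_model <- function(model, ped.genotypes) {
  stored <- ped.genotypes[["model"]]  # assumed element name
  if (is.null(model)) {
    return(stored)                    # model = NULL: use the stored model
  }
  stopifnot(model %in% c(1, 2))
  if (model == 2 && stored != 2) {
    stop("model = 2 requires ped.genotypes to have been created with model = 2")
  }
  model
}

# Toy input; real pedigree and genotype data are omitted here.
toy_genotypes <- list(pedigree = NULL, genotypes = NULL, model = 1)
resolved_default <- resolve_model(NULL, toy_genotypes)
resolved_explicit <- resolve_model(1, toy_genotypes)
```

With this toy input, requesting model = 2 would stop with an error, since the stored model is 1.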

chromosomes

a numeric vector containing a subset of chromosomes to perform IBD analysis on. The default is chromosomes=NULL and IBD analysis will be performed on all chromosomes in ped.genotypes.

number.cores

the number of cores used for parallel execution. The default value is 1.

minimum.snps

the minimum number of SNPs in an IBD segment for it to be reported. The default value is 20 SNPs.

minimum.length.bp

the minimum length, in base pairs, of an IBD segment for it to be reported. The default value is 50,000 bp.
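
As a rough illustration of how the two thresholds act on candidate segments, the sketch below filters a toy data frame. The toy values are made up, and it is an assumption here that both thresholds must be met and that they are inclusive; the column names follow the ibd_segments output described under Value.

```r
# Toy candidate segments (values invented for illustration only).
segments <- data.frame(
  number.snps = c(12, 45, 30),
  length.bp   = c(80000, 40000, 120000)
)

minimum.snps <- 20
minimum.length.bp <- 50000

# Assumption: a segment is reported only if it meets BOTH thresholds.
keep <- segments$number.snps >= minimum.snps &
  segments$length.bp >= minimum.length.bp
reported <- segments[keep, ]
```

Here only the third segment is kept: the first has too few SNPs and the second is too short.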

error

the genotyping error rate. The default value is 0.001.

posterior

a logical value indicating whether posterior probabilities for each pairwise analysis should be returned. The posterior probability is calculated for each SNP as posteriorPr(IBD=1)/2 + posteriorPr(IBD=2) using the forward and backward variables (Rabiner, 1989). A data frame containing probabilities for each SNP and each pairwise analysis is returned. This data frame can be very large when there are many SNPs and many pairwise analyses, and the run-time of getIBDsegments() will increase. The default is posterior=FALSE; posterior=TRUE is not recommended for large datasets.
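
The per-SNP summary described above can be written out directly. The sketch below uses made-up posterior state probabilities for three SNPs; the vector names post_ibd1 and post_ibd2 are illustrative and are not objects returned by the package.

```r
# Invented posterior state probabilities at three SNPs for one pair.
post_ibd1 <- c(0.10, 0.60, 0.20)  # posteriorPr(IBD = 1)
post_ibd2 <- c(0.00, 0.30, 0.80)  # posteriorPr(IBD = 2)

# Summary value per SNP: posteriorPr(IBD = 1) / 2 + posteriorPr(IBD = 2)
posterior_summary <- post_ibd1 / 2 + post_ibd2
```

The result lies between 0 and 1 and can be read as the expected proportion of alleles shared IBD at each SNP.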

Value

A named list of one object when posterior=FALSE and two objects when posterior=TRUE. The first object in the list, ibd_segments, is a data frame with the following columns:

  1. Family 1 ID (type "character")

  2. Individual 1 ID (type "character")

  3. Family 2 ID (type "character")

  4. Individual 2 ID (type "character")

  5. Chromosome (type "numeric" or integer)

  6. Start SNP (type "character")

  7. End SNP (type "character")

  8. Start position bp (type "numeric" or integer)

  9. End position bp (type "numeric" or integer)

  10. Start position M (type "numeric")

  11. End position M (type "numeric")

  12. Number of SNPs (type "numeric" or integer)

  13. Length bp (type "numeric" or integer)

  14. Length M (type "numeric")

  15. IBD status (1 = one allele shared IBD, 2 = two alleles shared IBD) (type "numeric" or integer)

where each row is a unique IBD segment for a pair of individuals. The columns are named fid1, iid1, fid2, iid2, chr, start.snp, end.snp, start.position.bp, end.position.bp, start.position.M, end.position.M, number.snps, length.bp, length.M, ibd.status. The second object, posterior_probabilities, is returned only when posterior=TRUE. It is a data frame whose first four columns are

  1. Chromosome (type "numeric" or integer)

  2. SNP identifier (type "character")

  3. Genetic map distance (type "numeric")

  4. Base-pair positions (type "numeric" or integer)

and columns 5 onwards are the posterior probabilities for each pair with pair identifier headers. Rows correspond to SNPs.
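
A common next step is to summarise the segments per pair. The sketch below builds a toy list shaped like the return value (only a subset of the columns, with invented values) and totals the shared length for each pair using base R.

```r
# Toy stand-in for the returned list; values are invented and only a
# subset of the ibd_segments columns is included.
my_ibd_toy <- list(
  ibd_segments = data.frame(
    fid1 = c("F1", "F1", "F2"),
    iid1 = c("A", "A", "C"),
    fid2 = c("F1", "F1", "F2"),
    iid2 = c("B", "B", "D"),
    chr = c(1, 2, 1),
    length.bp = c(150000, 90000, 60000),
    ibd.status = c(1, 2, 1),
    stringsAsFactors = FALSE
  )
)

# Total length (bp) shared IBD by each pair of individuals.
total_shared <- aggregate(length.bp ~ fid1 + iid1 + fid2 + iid2,
                          data = my_ibd_toy$ibd_segments, FUN = sum)
```

total_shared has one row per pair, with length.bp holding the summed segment length.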

Examples

## Not run: 
# infer IBD
my_ibd <- getIBDsegments(ped.genotypes = example_genotypes,
                         parameters = example_parameters,
                         model = NULL,
                         chromosomes = NULL,
                         number.cores = 1,
                         minimum.snps = 20,
                         minimum.length.bp = 50000,
                         error = 0.001,
                         posterior = FALSE)

str(my_ibd)

## End(Not run)

bahlolab/XIBD documentation built on May 11, 2019, 5:24 p.m.