segIBD | R Documentation |
Segment-based probability that alleles are IBD (identical by descent): for each pair of individuals, the function computes the probability that two alleles taken at a random position from randomly chosen haplotypes belong to a shared segment.
segIBD(files, map, minSNP=20, minL=1.0, unitP="Mb", unitL="Mb",
a=0.0, keep=NULL, skip=NA, cskip=NA, cores=1, quiet=FALSE)
files |
This parameter is either (1) a vector with names of phased marker files, one file for each chromosome, or (2) a list with two components. Each component is a vector with names of phased marker files, one file for each chromosome, and each component corresponds to a different set of individuals. This makes it possible to compute the kinship between individuals stored in two different files. File names must contain the chromosome name as specified in the |
map |
Data frame providing the marker map with columns including marker name |
minSNP |
Minimum number of marker SNPs included in a segment. |
minL |
Minimum length of a segment in |
unitP |
The unit for measuring the proportion of the genome included in shared segments.
Possible units are the number of marker SNPs included in shared segments ( |
unitL |
The unit for measuring the length of a segment. Possible units are the number of marker SNPs included in the segment ( |
a |
The function providing the weighting factor for each segment is w(x)=x*x/(a+x*x). The parameter of the function is the length of the segment in |
keep |
If |
skip |
Take line |
cskip |
Take column |
cores |
Number of cores to be used for parallel processing of chromosomes. By default one core is used. For |
quiet |
Should console output be suppressed? |
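The weighting function given for argument a can be evaluated directly to see how the parameter behaves. The following is a minimal sketch only: seg_weight is a hypothetical helper name that is not part of the package, and the segment lengths are made up.

```r
# Weighting factor from the description of argument `a`:
#   w(x) = x*x / (a + x*x)
# `seg_weight` is a hypothetical helper name, not a function of the package.
seg_weight <- function(x, a = 0) {
  x^2 / (a + x^2)
}

len <- c(0.5, 1, 2, 5)            # made-up segment lengths

w_default <- seg_weight(len, a = 0)  # a = 0 (the default): every weight is 1
w_damped  <- seg_weight(len, a = 1)  # a > 0: short segments are down-weighted

print(w_default)
print(w_damped)
```

With a = 0 the weights are all equal to 1, so segments of any admissible length count fully. With a = 1, a segment of length 1 receives weight 0.5 and the weight approaches 1 as the segment gets longer.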
For each pair of individuals, the probability is computed that two SNPs taken at a random position from randomly chosen haplotypes belong to a shared segment.
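To make the definition concrete, the toy calculation below averages the shared proportion of the genome over the four haplotype combinations of two individuals. It is a sketch under stated assumptions, not the package's implementation: the shared lengths and the genome length are invented, and the minSNP and minL filters and the weighting by w(x) are ignored.

```r
# Hypothetical toy data: total length (in Mb) of shared segments for each of
# the four haplotype combinations of individuals i and j.
shared_Mb <- matrix(c(10,  0,
                      25,  5),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("i_hap1", "i_hap2"),
                                    c("j_hap1", "j_hap2")))
genome_Mb <- 100  # assumed total genome length covered by the map

# For a fixed haplotype pair, a random position lies in a shared segment
# with probability shared length / genome length.
p_pair <- shared_Mb / genome_Mb

# Choosing one haplotype from each individual at random corresponds to
# averaging over the four combinations.
f_ij <- mean(p_pair)
print(f_ij)
```

Here the four proportions are 0.10, 0, 0.25 and 0.05, so the toy value for the pair is their mean, 0.1.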
Genotype file format: Each file containing phased genotypes has a header and no row names. Cells are separated by blank spaces. The number of rows is equal to the number of markers from the respective chromosome, and the markers are in the same order as in the map. The first cskip columns are ignored. The remaining columns contain genotypes of individuals written as two alleles separated by a character, e.g. A/B, 0/1, A|B, A B, or 0 1. The same two symbols must be used for all markers. Column names are the IDs of the individuals. If the blank space is used as separator, then the ID of each individual should be repeated in the header to get a regular delimited file. The columns to be skipped and the individual IDs must not contain white space.
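The layout described above can be illustrated with a tiny file written to a temporary location and read back with base R. This is only a sketch: the marker names, individual IDs and genotypes are made up, the toy file has one leading column to skip, and the parsing shown here is not how the package reads its files.

```r
# Write a tiny phased genotype file in the described layout.
# Marker names, IDs and genotypes are invented for illustration.
tmp <- tempfile(fileext = ".phased")
writeLines(c(
  "Name Ind1 Ind2 Ind3",   # header: one skipped column, then individual IDs
  "SNP1 0/1 1/1 0/0",      # one row per marker, alleles separated by "/"
  "SNP2 1/1 0/1 0/1",
  "SNP3 0/0 0/0 1/0"
), tmp)

# Read it back: header line, blank-separated cells, no row names.
geno <- read.table(tmp, header = TRUE, sep = " ", colClasses = "character")

cskip <- 1                                  # leading columns to ignore (toy file)
ids <- colnames(geno)[-seq_len(cskip)]      # remaining column names are the IDs

# Split the genotypes of one individual into its two haplotypes.
alleles <- strsplit(geno[["Ind1"]], "/", fixed = TRUE)
hap1 <- vapply(alleles, `[`, character(1), 1)
hap2 <- vapply(alleles, `[`, character(1), 2)

print(ids)
print(rbind(hap1, hap2))
unlink(tmp)
```

The file has three rows because the toy chromosome has three markers, and the rows would have to appear in the same order as those markers in the map.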
NxN segment-based kinship matrix with N being the number of individuals.
Robin Wellmann
de Cara MAR, Villanueva B, Toro MA, Fernandez J (2013). Using genomic tools to maintain diversity and fitness in conservation programmes. Molecular Ecology. 22: 6091-6099
data(map)
dir <- system.file("extdata", package = "optiSel")
files <- file.path(dir, paste("Chr", unique(map$Chr), ".phased", sep=""))
f <- segIBD(files, map, minSNP=15, minL=1.0)
mean(f)
#[1] 0.05677993
f <- segIBD(files, map, minSNP=15, minL=1.0, cores=NA)
mean(f)
#[1] 0.05677993
## Multidimensional scaling of animals:
## (note that only few markers are used)
data(Cattle)
library("smacof")
D <- sim2dis(f, 4)
color <- c(Angler="red", Rotbunt="green", Fleckvieh="blue", Holstein="black")
col <- color[as.character(Cattle$Breed)]
Res <- smacofSym(D, itmax = 5000, eps = 1e-08)
plot(Res$conf, pch=18, col=col, main="Multidimensional Scaling", cex=0.5)
mtext(paste("segIBD Stress1 = ", round(Res$stress,3)))