summary_statistics: Summary Statistics per Segment
In PhHermann/LDJump: Estimating Variable Recombination Rates from Population Genetic Data


This function computes summary statistics for every segment of the sequence. It writes a sequence file for each segment; LDhat and other packages then use these files to estimate all necessary parameters.

summary_statistics(x, s, segLength, segs, seqName, nn, pathLDhat, pathPhi, status, polyThres, out, format, startofseq)

`x`	An integer control variable for the considered segment of the DNA sequence.
`s`	An `XStringSet` object which is read by `readDNAStringSet`
`segLength`	An integer value for the length of the segments, provided by the user. The default value of 1000 is our recommended value (1 kb). The number of resulting segments is calculated within the function from the sequence length.
`segs`	A (non-negative) integer which reflects the number of segments considered. It is calculated in the program based on the user-defined `segLength`.
`seqName`	A character string containing the full path and the name of the sequence file in `fasta` or `vcf` format. The file extension must be included ("fileName.fa", "fileName.fasta", "fileName.vcf") in order to run `LDJump`. If `format` equals `DNABin`, `seqName` is the name of the `DNABin` object (without any extension).
`nn`	An integer which reflects the number of individuals (more precisely sequences) of the population to be analyzed. In case of diploid samples this is twice the number of individuals.
`pathLDhat`	A character string containing the path to LDhat. Both this path and an installation of LDhat are required for the package to run.
`pathPhi`	A character string containing the path to PhiPack. Both this path and an installation of PhiPack are required for the package to run.
`status`	An optional logical value: by default `TRUE`, such that the current processing status of the segments is printed.
`polyThres`	A numeric value between 0 and 1. Used in the data manipulation function `DNAbin2genind`: the minimum frequency of a minor allele for a locus to be considered polymorphic (defaults to 0).
`out`	An optional character string: by default an empty string "". Can be set to any user-defined string in order to rename all output files used within `LDJump`. This parameter makes it possible to run `LDJump` several times from the same directory without creating interfering files in the working directory.
`format`	A character string describing the format of the input file, e.g. "fasta" or "vcf". The default is "fasta".
`startofseq`	An integer value describing the position at which the sequence to be analyzed starts (only required when running `LDJump` with VCF files). The starting value is provided to `vcftools` to select the appropriate range for splitting the VCF file into segments. In `summary_statistics`, the same value is used to loop over each FASTA segment.
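
The relationship between `segLength`, `segs`, `x` and `startofseq` can be sketched in base R as follows. This is an illustration only, not the package's code: `segment_bounds` is a hypothetical helper, `ll` is assumed to be the total sequence length (the name appears in the example call but is not documented here), and the handling of a shorter final segment is an assumption.

```r
# Illustrative sketch: how a sequence of length `ll` could split into segments.
# `segment_bounds` is a hypothetical helper, not part of LDJump.
segment_bounds <- function(ll, segLength = 1000, startofseq = 1) {
  segs  <- ceiling(ll / segLength)                      # number of segments
  start <- startofseq + (seq_len(segs) - 1) * segLength # first position of each segment
  end   <- pmin(start + segLength - 1, startofseq + ll - 1) # last segment may be shorter (assumption)
  data.frame(x = seq_len(segs), start = start, end = end)
}

b <- segment_bounds(ll = 2500, segLength = 1000)
print(b)
```

Here `x` plays the role of the segment index that `summary_statistics` receives, so a 2500 bp sequence with the default `segLength` yields three values of `x`.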

This function returns a concatenated vector of all computed summary statistics:

`hahe`	The haplotype heterozygosity of the considered segment. Returned with stats.
`tajd`	Tajima's D. Only used in the regression model for demography.
`haps`	The number of haplotypes. Later on it is normalized by sequence length and number of individuals.
`apwd`	Average pairwise differences. Later it is normalized by sequence length.
`vapw`	Variance of pairwise differences. Later it is normalized by sequence length.
`wath`	Watterson's theta. Later it is normalized by sequence length.
`phis`	A vector containing the four summary statistics obtained from PhiPack as MaxChi, NSS, mean(Phi) and var(Phi).
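
To make some of these quantities concrete, the sketch below computes `haps`, `apwd`, `vapw` and `wath` for a toy alignment using textbook definitions in base R. It is not the package's implementation (which relies on LDhat, PhiPack and other packages); the alignment is made up, and the final division by the segment length is one assumed reading of "normalized by sequence length".

```r
# Toy alignment (assumed data): rows are sequences, columns are sites.
aln <- rbind(
  c("A", "C", "G", "T", "A"),
  c("A", "C", "G", "T", "A"),
  c("A", "T", "G", "T", "A"),
  c("G", "T", "G", "C", "A")
)
nn <- nrow(aln)  # number of sequences
ll <- ncol(aln)  # segment length in sites

# haps: number of distinct haplotypes
haps <- length(unique(apply(aln, 1, paste, collapse = "")))

# apwd / vapw: mean and variance of the pairwise differences
pairs <- combn(nn, 2)
pwd   <- apply(pairs, 2, function(p) sum(aln[p[1], ] != aln[p[2], ]))
apwd  <- mean(pwd)
vapw  <- var(pwd)

# wath: Watterson's theta, S / a_n with a_n = sum_{i=1}^{n-1} 1/i
S    <- sum(apply(aln, 2, function(col) length(unique(col)) > 1))
wath <- S / sum(1 / seq_len(nn - 1))

# Per-site values (assumed form of the later normalization)
c(apwd = apwd, vapw = vapw, wath = wath) / ll
```

For this alignment there are three distinct haplotypes and three segregating sites, so `haps` is 3 and `wath` is 3 divided by (1 + 1/2 + 1/3).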

Philipp Hermann philipp.hermann@jku.at, Andreas Futschik, Fardokhtsadat Mohammadi fardokht.fm@gmail.com

Auton, A. and McVean, G. (2007). Recombination rate estimation in the presence of hotspots. Genome Research, 17(8), 1219–1227.

Bruen, T. C., Philippe, H., and Bryant, D. (2006). A simple and robust statistical test for detecting the presence of recombination. Genetics, 172(4):2665-2681.

Jombart T. and Ahmed I. (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi:10.1093/bioinformatics/btr521

Hermann, P., Heissl, A., Tiemann-Boege, I., and Futschik, A. (2019), LDJump: Estimating Variable Recombination Rates from Population Genetic Data. Mol Ecol Resour. doi:10.1111/1755-0998.12994.

McVean, G. A. T., Myers, S. R., Hunt, S., Deloukas, P., Bentley, D. R., and Donnelly, P. (2004). The fine-scale structure of recombination rate variation in the human genome. Science, 304(5670), 581–584.

Paradis E., Claude J. & Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289-290.

LDJump, vcfR_to_fasta, getPhi, get_smuce, readDNAStringSet, DNAbin2genind

##### Do not run these examples                                                      #####
##### In LDJump.R the function is called as follows                                  #####
##### sapply(1:segs,summary_statistics,s=s,segs=segs,seqName=seqName,nn=nn,ll = ll)  #####
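
The calling pattern in the comment above can be tried out without LDhat or PhiPack by substituting a stand-in function. In this sketch `toy_stats` is hypothetical and returns placeholders only; it shows how `sapply` passes each segment index as `x` and collects one column per segment.

```r
# Hypothetical stand-in for summary_statistics(); returns placeholders only.
toy_stats <- function(x, segs, nn) {
  c(segment = x, nn = nn)
}

segs <- 4
res  <- sapply(1:segs, toy_stats, segs = segs, nn = 10)

dim(res)       # 2 rows (one per returned value), 4 columns (one per segment)
rownames(res)  # "segment" "nn"
```

With the real function, each column would instead hold the concatenated summary statistics of one segment.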

PhHermann/LDJump documentation built on Nov. 16, 2019, 12:53 p.m.
