View source: R/Zalpha_Zscore.R
Returns a Zalpha value for each SNP location supplied to the function, based on the expected r^2 values given an LD profile and genetic distances. For more information about the Zalpha statistic, please see Jacobs et al. (2016). The Zalpha statistic is defined as:
Z_{\alpha}^{Zscore} = \frac{\binom{|L|}{2}^{-1} \sum_{i,j \in L} \frac{r^2_{i,j} - E[r^2_{i,j}]}{\sigma[r^2_{i,j}]} + \binom{|R|}{2}^{-1} \sum_{i,j \in R} \frac{r^2_{i,j} - E[r^2_{i,j}]}{\sigma[r^2_{i,j}]}}{2}
where |L| and |R| are the number of SNPs to the left and right of the current locus within the given window ws, r^2 is the squared correlation between a pair of SNPs, E[r^2] is the expected squared correlation between a pair of SNPs given an LD profile, and σ[r^2] is the corresponding standard deviation.
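As a rough illustration of the definition above, the following base R sketch computes the statistic for a single focal SNP from a small haplotype matrix. The helper name `zscore_side`, the toy data, and the constant expected value and standard deviation (0.1 and 0.2) are assumptions made for this example only. It is not the package's implementation; in practice E[r^2] and σ[r^2] would be looked up for each pair from the LD profile.

```r
# Hypothetical helper: mean standardised r^2 over all SNP pairs on one side.
# geno: matrix with SNPs in rows and chromosomes in columns (as for `x`).
# exp_rsq, sd_rsq: expected r^2 and its standard deviation. They are held
# constant here for simplicity, which is an assumption of this sketch.
zscore_side <- function(geno, exp_rsq, sd_rsq) {
  rsq <- cor(t(geno))^2            # pairwise squared correlations between SNPs
  pairs <- rsq[lower.tri(rsq)]     # each unordered pair (i, j) counted once
  # Averaging over the choose(n, 2) pairs is the {n choose 2}^-1 * sum term.
  mean((pairs - exp_rsq) / sd_rsq)
}

# Toy data: 4 SNPs on each side of the focal SNP, 6 chromosomes.
left <- rbind(c(0, 0, 1, 1, 0, 1),
              c(0, 1, 1, 1, 0, 1),
              c(1, 0, 1, 0, 0, 1),
              c(0, 0, 1, 1, 1, 0))
right <- rbind(c(1, 1, 0, 0, 1, 0),
               c(1, 0, 0, 0, 1, 0),
               c(0, 1, 0, 1, 1, 0),
               c(1, 1, 0, 0, 0, 1))

# Average of the left-hand and right-hand terms, as in the formula.
zalpha_zscore_one <- (zscore_side(left, 0.1, 0.2) +
                      zscore_side(right, 0.1, 0.2)) / 2
```

A SNP that is monomorphic across the chromosomes would give an undefined correlation, so the toy rows are all polymorphic.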
Zalpha_Zscore(
  pos,
  ws,
  x,
  dist,
  LDprofile_bins,
  LDprofile_rsq,
  LDprofile_sd,
  minRandL = 4,
  minRL = 25,
  X = NULL
)
pos
A numeric vector of SNP locations
ws
The window size which the Zalpha statistic will be calculated over. This should be on the same scale as the
x
A matrix of SNP values. Columns represent chromosomes; rows are SNP locations. Hence, the number of rows should equal the length of the
dist
A numeric vector of genetic distances (e.g. cM, LDU). This should be the same length as
LDprofile_bins
A numeric vector containing the lower bound of the bins used in the LD profile. These should be of equal size.
LDprofile_rsq
A numeric vector containing the expected r^2 values for the corresponding bin in the LD profile. Must be between 0 and 1.
LDprofile_sd
A numeric vector containing the standard deviation of the r^2 values for the corresponding bin in the LD profile.
minRandL
Minimum number of SNPs in each set R and L for the statistic to be calculated. Default is 4.
minRL
Minimum value for the product of the set sizes for R and L. Default is 25.
X
Optional. Specify a region of the chromosome to calculate Zalpha for in the format
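One way to read the minRandL and minRL thresholds described above is as a pair of checks on the set sizes |L| and |R|. The sketch below is a hypothetical helper, not a function from the package; treating the comparisons as inclusive (>=) is an assumption.

```r
# Hypothetical helper, not part of the package: TRUE when a SNP has enough
# neighbours on each side for the statistic to be calculated.
# Inclusive comparisons (>=) are an assumption of this sketch.
enough_snps <- function(n_left, n_right, minRandL = 4, minRL = 25) {
  n_left >= minRandL && n_right >= minRandL && n_left * n_right >= minRL
}

enough_snps(5, 5)    # TRUE:  both sides >= 4 and 5 * 5 = 25
enough_snps(4, 6)    # FALSE: 4 * 6 = 24 is below minRL
enough_snps(3, 20)   # FALSE: the left side has fewer than minRandL SNPs
```

Under the defaults, the product condition excludes SNPs with a very unbalanced split, such as 4 SNPs on one side and 6 on the other, even though each side individually meets minRandL.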
The LD profile describes the expected correlation between SNPs at a given genetic distance, generated using simulations or real data. Care should be taken to use an LD profile that is representative of the population in question. The LD profile should consist of evenly sized bins of distances (for example, 0.0001 cM per bin), where the value given is the (inclusive) lower bound of the bin. Ideally, an LD profile would be generated using data from a null population with no selection; however, one can be generated using the same data that is being analysed. See the create_LDprofile function for more information on how to create an LD profile.
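To make the bin convention concrete, the base R sketch below looks up the expected r^2 and standard deviation for a few genetic distances using inclusive lower bounds. The profile values are invented for illustration, and using `findInterval` for the lookup is an assumption; the package may perform this step differently. In this sketch, distances beyond the last lower bound fall into the last bin.

```r
# Toy LD profile (values invented for illustration only).
bins <- c(0, 0.0001, 0.0002, 0.0003)   # inclusive lower bounds, equal width
rsq  <- c(0.50, 0.40, 0.30, 0.20)      # expected r^2 per bin
sds  <- c(0.30, 0.25, 0.20, 0.15)      # standard deviation of r^2 per bin

# Genetic distances (cM) between some SNP pairs.
d <- c(0.00005, 0.0001, 0.00025)

# findInterval() returns the index of the last lower bound that is <= d,
# so a distance equal to a lower bound belongs to that bin.
idx <- findInterval(d, bins)

expected_rsq <- rsq[idx]   # 0.5 0.4 0.3
expected_sd  <- sds[idx]   # 0.30 0.25 0.20
```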
A list containing the SNP positions and the Zalpha values for those SNPs.
Jacobs, G.S., T.J. Sluckin, and T. Kivisild, Refining the Use of Linkage Disequilibrium as a Robust Signature of Selective Sweeps. Genetics, 2016. 203(4): p. 1807
## load the snps and LDprofile example datasets
data(snps)
data(LDprofile)
## run Zalpha_Zscore over all the SNPs with a window size of 3000 bp
Zalpha_Zscore(snps$bp_positions, 3000, as.matrix(snps[, 3:12]), snps$cM_distances,
              LDprofile$bin, LDprofile$rsq, LDprofile$sd)
## only return results for SNPs between locations 600 and 1500 bp
Zalpha_Zscore(snps$bp_positions, 3000, as.matrix(snps[, 3:12]), snps$cM_distances,
              LDprofile$bin, LDprofile$rsq, LDprofile$sd, X = c(600, 1500))