bed_clumping | R Documentation |
For a bigSNP:
snp_pruning(): LD pruning. Similar to "--indep-pairwise (size+1) 1 thr.r2" in PLINK. This function is deprecated (see this article).
snp_clumping() (and bed_clumping()): LD clumping. If you do not provide any statistic to rank SNPs, it uses minor allele frequencies (MAFs), making clumping similar to pruning.
snp_indLRLDR(): Get SNP indices of long-range LD regions for the human genome.
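To make the clumping idea concrete, here is a minimal base-R sketch of greedy LD clumping on a toy genotype matrix. It is only an illustration under simplifying assumptions (a dense correlation matrix, a window counted in number of SNPs, a single chromosome); `toy_clumping()` is a hypothetical helper, not part of bigsnpr, and it is not meant to reproduce the package's actual implementation.

```r
# Hypothetical helper (not part of bigsnpr): greedy LD clumping in base R.
toy_clumping <- function(G, S = NULL, thr.r2 = 0.2, size = 2) {
  m <- ncol(G)
  if (is.null(S)) {
    # No statistic supplied: rank SNPs by minor allele frequency
    af <- colMeans(G) / 2
    S <- pmin(af, 1 - af)
  }
  r2 <- cor(G)^2                 # squared correlations between all SNP pairs
  keep <- logical(m)
  removed <- logical(m)
  for (j in order(S, decreasing = TRUE)) {   # most important SNPs first
    if (removed[j]) next
    keep[j] <- TRUE
    # Window of `size` SNPs on each side of the index SNP
    win <- max(1, j - size):min(m, j + size)
    # Discard every SNP in the window that is too correlated with SNP j
    removed[win[r2[j, win] > thr.r2]] <- TRUE
  }
  which(keep)
}

# Toy data: SNP 2 duplicates SNP 1 (perfect LD); SNP 3 is uncorrelated with both
snp1 <- rep(c(0, 1, 2, 1), times = 24)
snp3 <- rep(c(0, 0, 1, 1, 2, 2, 1, 1), times = 12)
G.toy <- cbind(snp1, snp2 = snp1, snp3)

toy_clumping(G.toy)                   # MAF ranking: one of SNPs 1/2 is dropped
toy_clumping(G.toy, S = c(1, 3, 2))   # SNP 2 ranked first, so SNP 1 is dropped
```

When `S` is not supplied, the ranking depends only on allele frequencies, which is why the result resembles pruning: one SNP per correlated group is retained regardless of any association signal.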
bed_clumping(
  obj.bed,
  ind.row = rows_along(obj.bed),
  S = NULL,
  thr.r2 = 0.2,
  size = 100/thr.r2,
  exclude = NULL,
  ncores = 1
)

snp_clumping(
  G,
  infos.chr,
  ind.row = rows_along(G),
  S = NULL,
  thr.r2 = 0.2,
  size = 100/thr.r2,
  infos.pos = NULL,
  is.size.in.bp = NULL,
  exclude = NULL,
  ncores = 1
)

snp_pruning(
  G,
  infos.chr,
  ind.row = rows_along(G),
  size = 49,
  is.size.in.bp = FALSE,
  infos.pos = NULL,
  thr.r2 = 0.2,
  exclude = NULL,
  nploidy = 2,
  ncores = 1
)

snp_indLRLDR(infos.chr, infos.pos, LD.regions = LD.wiki34)
obj.bed |
Object of type bed, which is the mapping of some bed file.
Use |
ind.row |
An optional vector of the row indices (individuals) that
are used. If not specified, all rows are used. |
S |
A vector of column statistics which express the importance
of each SNP (the more important the SNP, the greater the
corresponding statistic should be). |
thr.r2 |
Threshold over the squared correlation between two SNPs.
Default is 0.2. |
size |
For one SNP, window size around this SNP to compute correlations.
Default is 100/thr.r2 for snp_clumping() and bed_clumping(), and 49 for snp_pruning(). |
exclude |
Vector of SNP indices to exclude anyway. For example,
can be used to exclude long-range LD regions (see Price2008). Another use
can be for thresholding with respect to p-values associated with |
ncores |
Number of cores used. Default doesn't use parallelism.
You may use |
G |
A FBM.code256
(typically |
infos.chr |
Vector of integers specifying each SNP's chromosome. |
infos.pos |
Vector of integers specifying the physical position
on a chromosome (in base pairs) of each SNP. |
is.size.in.bp |
Deprecated. |
nploidy |
Number of trials, parameter of the binomial distribution.
Default is 2. |
LD.regions |
A |
snp_clumping() (and bed_clumping()): SNP indices that are kept.
snp_indLRLDR(): SNP indices to be used as (part of) the 'exclude' parameter of snp_clumping().
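As an illustration of how the two return values fit together, the sketch below passes the output of snp_indLRLDR() as the exclude argument of snp_clumping(). It reuses the example data from snp_attachExtdata() and only the arguments listed in the usage above; the variable names CHR, POS and ind.excl are chosen here for readability, and the set of excluded indices may well be empty for this small dataset.

```r
library(bigsnpr)

test <- snp_attachExtdata()
G <- test$genotypes
CHR <- test$map$chromosome
POS <- test$map$physical.pos

# Indices of SNPs located in long-range LD regions (default LD.regions)
ind.excl <- snp_indLRLDR(infos.chr = CHR, infos.pos = POS)

# Clumping that never keeps the excluded SNPs
ind.keep <- snp_clumping(G, infos.chr = CHR, infos.pos = POS,
                         thr.r2 = 0.2, exclude = ind.excl)

# No excluded SNP should be among the kept ones
length(intersect(ind.keep, ind.excl))
```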
Price AL, Weale ME, Patterson N, et al. Long-Range LD Can Confound Genome Scans in Admixed Populations. Am J Hum Genet. 2008;83(1):132-135. doi:10.1016/j.ajhg.2008.06.005
test <- snp_attachExtdata()
G <- test$genotypes

# clumping (prioritizing higher MAF)
ind.keep <- snp_clumping(G, infos.chr = test$map$chromosome,
                         infos.pos = test$map$physical.pos,
                         thr.r2 = 0.1)

# keep most of them -> not much LD in this simulated dataset
length(ind.keep) / ncol(G)
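A further sketch, showing how S and exclude might be combined when a per-SNP statistic is available. The p-values here are random placeholders generated only for illustration (they are not results from any analysis), and the 0.05 cutoff is an arbitrary assumption; in practice S would come from your own association test.

```r
library(bigsnpr)

test <- snp_attachExtdata()
G <- test$genotypes

# Placeholder p-values (assumption: stand-ins for real association results)
set.seed(1)
pval <- runif(ncol(G))

# Larger S means more important, so use -log10(p)
S <- -log10(pval)

# Thresholding: SNPs with p > 0.05 are excluded before clumping
ind.excl <- which(pval > 0.05)

ind.keep2 <- snp_clumping(G, infos.chr = test$map$chromosome,
                          infos.pos = test$map$physical.pos,
                          S = S, thr.r2 = 0.1, exclude = ind.excl)

# All kept SNPs pass the p-value threshold
all(pval[ind.keep2] <= 0.05)
```

Ranking by S means that, within each group of correlated SNPs, the one with the smallest p-value is the one retained.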