seqUnitApply | R Documentation |
Applies a user-defined function to each variant unit.
seqUnitApply(gdsfile, units, var.name, FUN, as.is=c("none", "list", "unlist"),
parallel=FALSE, ..., .bl_size=256L, .progress=FALSE, .useraw=FALSE,
.padNA=TRUE, .tolist=FALSE, .envir=NULL)
gdsfile |
a SeqVarGDSClass object |
units |
a list of units of selected variants, with S3 class SeqUnitListClass |
var.name |
the variable name(s), see details |
FUN |
the function to be applied |
as.is |
the form of the returned value: "list" for a list, "unlist" for a vector
(e.g., an integer vector), or "none" (the default) to return nothing |
parallel |
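FALSE (serial processing), TRUE (multicore processing), a numeric value
or other value; parallel is passed to the argument cl in seqParallel |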
.bl_size |
chunk size, the increment for load balancing, 256 for units |
.progress |
if TRUE, show progress information |
.useraw |
|
.padNA |
|
.tolist |
if |
.envir |
NULL, an environment object, or a list/data.frame |
... |
optional arguments to FUN |
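The effect of as.is can be illustrated with a minimal sketch. It assumes the SeqArray package and its bundled example file; the helper count_multi and the object names f, u, res_list and res_vec are hypothetical names introduced only for this illustration.

```r
library(SeqArray)

# open the example GDS file shipped with SeqArray
f <- seqOpen(seqExampleFileName("gds"))

# variant units from sliding windows with the default settings
u <- seqUnitSlidingWindows(f)

# hypothetical helper: number of variants in a unit with more than 2 alleles
count_multi <- function(x) sum(x > 2L)

# as.is="list": one list element per unit
res_list <- seqUnitApply(f, u, "$num_allele", count_multi, as.is="list")

# as.is="unlist": the per-unit results combined into a vector
res_vec <- seqUnitApply(f, u, "$num_allele", count_multi, as.is="unlist")

# both forms should carry the same per-unit values
all(unlist(res_list) == res_vec)

seqClose(f)
```

With the default as.is="none", FUN is called only for its side effects and the results are not collected.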
The variable name should be "sample.id", "variant.id", "position",
"chromosome", "allele", "genotype", "annotation/id", "annotation/qual",
"annotation/filter", "annotation/info/VARIABLE_NAME", or
"annotation/format/VARIABLE_NAME".
"@genotype", "annotation/info/@VARIABLE_NAME" or
"annotation/format/@VARIABLE_NAME" are used to obtain the index
associated with these variables.
"$dosage" is also allowed for the dosages of reference allele (integer:
0, 1, 2 and NA for diploid genotypes).
"$dosage_alt" returns a RAW/INTEGER matrix for the dosages of alternative
allele without distinguishing different alternative alleles.
"$dosage_sp" returns a sparse matrix (dgCMatrix) for the dosages of
alternative allele without distinguishing different alternative alleles.
"$num_allele" returns an integer vector with the numbers of distinct
alleles.
"$ref" returns a character vector of reference alleles.
"$alt" returns a character vector of alternative alleles (delimited by
comma).
"$chrom_pos" returns characters with the combination of chromosome and
position, e.g., "1:1272721".
"$chrom_pos_allele" returns characters with the combination of chromosome,
position and alleles, e.g., "1:1272721_A_G" (i.e., chr:position_REF_ALT).
"$variant_index" returns the indices of selected variants starting
from 1, and "$sample_index" returns the indices of selected samples
starting from 1.
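As a minimal sketch of how one of the "$" variable names might be used, the code below passes "$dosage_alt" to seqUnitApply and summarizes each unit by its mean alternative-allele dosage. It assumes the SeqArray package and its bundled example file; the object names f, u and mean_alt are hypothetical and chosen only for this illustration.

```r
library(SeqArray)

# open the example GDS file shipped with SeqArray
f <- seqOpen(seqExampleFileName("gds"))

# variant units from sliding windows with the default settings
u <- seqUnitSlidingWindows(f)

# for each unit, FUN receives the matrix of alternative-allele dosages
# restricted to the variants of that unit; with the default .useraw=FALSE
# it is an integer matrix in which missing genotypes are NA
mean_alt <- seqUnitApply(f, u, "$dosage_alt",
    function(x) mean(x, na.rm=TRUE), as.is="unlist")

# one summary value per unit
length(mean_alt) == length(u$index)
summary(mean_alt)

seqClose(f)
```

Returning a single number per unit keeps as.is="unlist" well defined; a FUN that returns vectors of different lengths is usually easier to handle with as.is="list".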
A vector, a list of values or none.
Xiuwen Zheng
seqUnitSlidingWindows, seqUnitFilterCond
library(SeqArray)
# open the GDS file
gdsfile <- seqOpen(seqExampleFileName("gds"))
# variant units via sliding windows
units <- seqUnitSlidingWindows(gdsfile)
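# number of variants in each unit (third dimension of the genotype array)
v1 <- seqUnitApply(gdsfile, units, "genotype", function(x) dim(x)[3L],
    as.is="unlist", .progress=TRUE)
# the same computation with two parallel processes
v2 <- seqUnitApply(gdsfile, units, "genotype", function(x) dim(x)[3L],
    as.is="unlist", parallel=2, .progress=TRUE)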
all(v1 == lengths(units$index))
all(v1 == v2)
# call with an external R variable: "$:x" refers to x in .envir,
# subset to the variants of each unit
ext <- list(x=1:1348/10)
v3 <- seqUnitApply(gdsfile, units, "$:x", function(x) x,
    as.is="list", .progress=TRUE, .envir=ext)
head(units$index)
head(v3)
table(sapply(seq_along(units$index), function(i) all(units$index[[i]] == v3[[i]]*10)))
# all TRUE
# close the GDS file
seqClose(gdsfile)