VcfStack | R Documentation |
The VcfStack
class is a vector of related VCF files, for instance
each file representing a separate chromosome. The class helps manage these
files as a group. The RangedVcfStack
class extends VcfStack
by
associating genomic ranges of interest to the collection of VCF files.
VcfStack(files=NULL, seqinfo=NULL, colData=NULL,
index=TRUE, check=TRUE)
Creates a VcfStack object.
files
A VcfFilelist object. If a VcfFile or character vector is given a VcfFileList will be coerced. The character vector should be files paths pointing to VCF files. The character vector must be named, with names correspond to seqnames in each VCF file.
seqinfo
A Seqinfo object describing the levels genome and circularity of each sequence.
colData
An optional DataFrame describing each sample in the VcfStack. When present, row names must correspond to sample names in the VCF file.
index
A logical indicating if the vcf index files should be created.
check
A logical indicating if the check across samples should be performed
RangedVcfStack(vs=NULL, rowRanges=NULL)
Creates a RangedVcfStack object.
vs
A VcfStack
object.
rowRanges
An optional GRanges object associating
the genomic ranges of interest to the collection of VCF
files. The seqnames of rowRanges
are a subset of
seqnames(vs)
. If missing, a default is created from
the seqinfo
object of the provided VcfStack
.
In the code below, x
is a VcfStack or RangedVcfStack object.
Get the number of files and samples in the VcfStack
object.
Get the sample names in the VcfStack
.
Get the names of the files in VcfStack
.
Get the names of samples and the names of files in VcfStack
.
Get or set the files on x
. value
can be a named
character() of file paths or a
VcfFileList. The return value will be a
VcfFileList.
Get or set the seqinfo on x
. See seqinfo<-
for details on new2old
and pruning.mode
.
Set the seqlevels according to the supplied style. File names and rowRanges will also be updated if applicable. See seqlevelsStyle<- for more details.
Get or set the colData
on x
. value
is a
DataFrame.
Get or set the rowRanges
on x
. x
has to be a
RangedVcfStack
object. value
is a
GRanges.
In the code below, x
is a VcfStack or RangedVcfStack
object. i
is a GRanges object,
character() vector of seqnames,
numeric() vector, logical() vector, or can be missing. For a
RangedVcfStack object, assay and readVcfStack will use the associated
rowRanges
object for i
.
Returns a CharacterList
of all
available VCF fields, with names of fixed
, info
,
geno
and samples
indicating the four categories.
Each element is a character() vector of available VCF field
names within each category.
Get matrix of genotype calls from the VCF files.
See genotypeToSnpMatrix. Argument i
specifies which files to read. BPPARAM
is the argument to
the bpmapply.
Get content of VCF files in the VcfStack. i
indicates which
files to read. j
can be missing or a character() vector of
sample names (see
samples) present in the
VCF files. param
is a
ScanVcfParam object. If param
is
used i
and j
are ignored.
Display abbreviated information about VcfStack
or
RangedVcfStack
object.
In the code below, x
is a VcfStack or RangedVcfStack
object.
Get elements from ranges i
and samples j
as a
VcfStack or RangedVcfStack object. Note: for a RangedVcfStack
,
the rowRanges
object will also be subset.
i
can be missing, a character() vector of
seqnames, numeric() vector of
indexes, logical() or GRanges
object. When i
is a
GRanges
object, seqnames(i)
is then
used to subset the files in the VcfStack.
j
can be missing, a character() vector of sample names, a
numeric(), logical() vector.
Deprecated. Use files(vs)[chrtok]
instead.
Translate seqnames chrtoks
to 1000 genomes genotype VCF urls.
Lori Shepherd mailto:Lori.Shepherd@RoswellPark.org and Martin Morgan mailto:Martin.Morgan@RoswellPark.org
VcfFile, VcfFileList.
## ---------------------------------------------------------------------
## CONSTRUCTION
## ---------------------------------------------------------------------
## point to VCF files and add names corresponding to the sequence
## present in the file
extdata <- system.file(package="GenomicFiles", "extdata")
files <- dir(extdata, pattern="^CEUtrio.*bgz$", full=TRUE)
names(files) <- sub(".*_([0-9XY]+).*", "\\1", basename(files))
## input data.frame describing the length of each sequence, coerce to
## 'Seqinfo' object
seqinfo <- as(readRDS(file.path(extdata, "seqinfo.rds")), "Seqinfo")
stack <- VcfStack(files, seqinfo)
stack
## Use seqinfo from VCF files instead of explict value
stack2 <- VcfStack(files)
rstack <- RangedVcfStack(stack)
gr <- GRanges(c("7:1-159138000", "X:1-155270560"))
rstack2 <- RangedVcfStack(stack, gr)
rstack2
## ---------------------------------------------------------------------
## ACCESSORS
## ---------------------------------------------------------------------
dim(stack)
colnames(stack)
rownames(stack)
dimnames(stack)
head(files(stack))
seqinfo(stack)
colData(stack)
## ---------------------------------------------------------------------
## METHODS
## ---------------------------------------------------------------------
readVcfStack(stack, i=GRanges("20:862167-62858306"))
i <- GRanges(c("20:862167-62858306", "7:1-159138000"))
readVcfStack(stack, i=i, j="NA12891")
head(assay(stack, gr))
head(assay(rstack2))
seqlevels(stack2)
rownames(stack2)
seqlevelsStyle(stack2)
seqlevelsStyle(stack2) <- "UCSC"
seqlevelsStyle(stack2)
seqlevels(stack2)
rownames(stack2)
vcfFields(stack2)
## ---------------------------------------------------------------------
## SUBSETTING
## ---------------------------------------------------------------------
## select rows 4, 5, 6 and samples 1, 2
stack[4:6, 1:2]
## select rownames "7", "11" and sample "NA12891"
stack[c("7", "11"), "NA12891"]
stack[c("7", "11", "X"), 2:3]
## subset with GRanges
stack[GRanges("20:862167-62858306")]
rstack2[]
rstack2[,1]
## ---------------------------------------------------------------------
## HELPERS
## ---------------------------------------------------------------------
paths1kg(1:3)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.