example.seqz: Example "seqz" data
In sequenza: Copy Number Estimation from Tumor Genome Sequencing Data


The “seqz” file is produced by sequenza-utils and typically has the file extension ‘.seqz’. The data here is representative of a seqz file derived from an exome-sequenced tumor sample, such as could be obtained from TCGA.

data(example.seqz)

A data frame with 53937 rows and 14 columns:

[,1]	chromosome	Chromosome name
[,2]	position	Base position
[,3]	base.ref	Base in the reference genome
[,4]	depth.normal	Read depth in the normal sample
[,5]	depth.tumor	Read depth in the tumor sample
[,6]	depth.ratio	Ratio of `depth.tumor` to `depth.normal`
[,7]	Af	A-allele frequency in the tumor sample
[,8]	Bf	B-allele frequency in the tumor sample, in heterozygous positions only
[,9]	zygosity.normal	Zygosity of the normal sample: "hom" for homozygous or "het" for heterozygous
[,10]	GC.percent	% GC content
[,11]	good.reads	Number of reads from the tumor sample which pass the quality threshold
[,12]	AB.normal	Base(s) found in the normal sample, sorted by allele frequency if more than one
[,13]	AB.tumor	Base(s) found in the tumor sample but not in the normal specimen, with their observed frequencies, separated by colons
[,14]	tumor.strand	Identical to `AB.tumor` but indicating, for each variant base, the fraction of reads oriented in the forward direction
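
To make the relationship between a few of these columns concrete, here is a minimal sketch on a made-up three-row data frame. The values are invented for illustration and are not taken from example.seqz; the sketch assumes, per the column description, that `depth.ratio` is the plain quotient of the two depth columns.

```r
# Toy data frame using a subset of the seqz column names.
# All values are invented for illustration only.
toy <- data.frame(
  chromosome      = c("chr1", "chr1", "chr2"),
  position        = c(1000L, 2000L, 1500L),
  depth.normal    = c(40L, 60L, 50L),
  depth.tumor     = c(60L, 30L, 50L),
  zygosity.normal = c("hom", "het", "hom"),
  stringsAsFactors = FALSE
)

# depth.ratio as described in the table: tumor depth over normal depth.
toy$depth.ratio <- toy$depth.tumor / toy$depth.normal

# Positions that are heterozygous in the normal sample,
# which is where Bf is reported.
het <- toy[toy$zygosity.normal == "het", ]
```

The same subsetting idiom should apply to the real data frame once it is loaded, since it uses the column names listed above.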

example.seqz can be loaded in the standard R way via data(example.seqz), or it can be read from a text file using read.seqz. The former is useful for examples and testing, whereas the latter is representative of the standard workflow.
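
The two loading routes might look roughly like the sketch below. It assumes that the sequenza package is installed and that it ships a compressed example file named example.seqz.txt.gz under extdata, as used in the package vignette; if your installed version places the file elsewhere, substitute the path to your own seqz file.

```r
# Run only when sequenza is available, so the sketch is safe to source anywhere.
if (requireNamespace("sequenza", quietly = TRUE)) {
  library(sequenza)

  # Route 1: load the bundled data frame (handy for examples and tests).
  data(example.seqz)
  print(dim(example.seqz))

  # Route 2: read a seqz text file, as in the standard workflow.
  # ASSUMPTION: this file location follows the package vignette and
  # may differ between sequenza versions.
  seqz_path <- system.file("extdata", "example.seqz.txt.gz",
                           package = "sequenza")
  if (nzchar(seqz_path)) {
    seqz_from_file <- read.seqz(seqz_path)
    print(head(seqz_from_file))
  }
}
```

`system.file` returns an empty string when the file is not found, which is why the second route is guarded with `nzchar`.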

This dataset is derived from a TCGA specimen, but has been scrambled to anonymize the source. The reference genome is hg19. The GC content was calculated in 50-base windows.

sequenza documentation built on May 9, 2019, 5:04 p.m.