example.seqz: Example "seqz" data

Description

A “seqz” file is produced by sequenza-utils and typically has the file extension ‘.seqz’. The data here is representative of a seqz file derived from an exome-sequenced tumor sample, such as one that could be obtained from TCGA.

Usage

data(example.seqz)

Format

A data frame with 53937 rows and 14 columns:

[,1] chromosome Chromosome name
[,2] position Base position
[,3] base.ref Base in the reference genome
[,4] depth.normal Read depth in the normal sample
[,5] depth.tumor Read depth in the tumor sample
[,6] depth.ratio Ratio of depth.tumor to depth.normal
[,7] Af A-allele frequency in the tumor sample
[,8] Bf B-allele frequency in the tumor sample, in heterozygous positions only
[,9] zygosity.normal Zygosity of the normal sample: "hom" for homozygous or "het" for heterozygous
[,10] GC.percent GC content, as a percentage
[,11] good.reads Number of reads from the tumor sample that pass the quality threshold
[,12] AB.normal Base(s) found in the normal sample, sorted by allele frequency if more than one
[,13] AB.tumor Base(s) found in the tumor sample but not in the normal specimen, with their observed frequencies, separated by colons
[,14] tumor.strand Identical to AB.tumor but indicating, for each variant base, the fraction of reads oriented in the forward direction
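
As a rough illustration of how some of these columns relate, the sketch below builds a toy data frame in base R. The values are made up for illustration and are not taken from example.seqz, only a subset of the 14 columns is included, and it assumes that depth.ratio is depth.tumor divided by depth.normal.

```r
# Toy rows with made-up values; NOT taken from example.seqz.
toy <- data.frame(
  chromosome      = c("chr1", "chr1", "chr2"),
  position        = c(1000L, 2000L, 3000L),
  depth.normal    = c(40L, 50L, 20L),
  depth.tumor     = c(60L, 25L, 20L),
  zygosity.normal = c("hom", "het", "het"),
  stringsAsFactors = FALSE
)

# Assumption: depth.ratio is tumor depth over normal depth.
toy$depth.ratio <- toy$depth.tumor / toy$depth.normal

# Keep only positions that are heterozygous in the normal sample,
# which is where Bf is reported according to the column description.
het <- toy[toy$zygosity.normal == "het", ]
print(het)
```

The same subsetting idiom should apply to the real data frame once it is loaded, since zygosity.normal takes the values "hom" and "het".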

Details

example.seqz can be loaded in the standard R way via data(example.seqz), or it can be read from a text file using read.seqz. The former is useful for examples and testing, whereas the latter is representative of the standard workflow.
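
The two loading routes can be sketched as follows. This is a minimal example that assumes the sequenza package is installed; the location of the bundled text file (extdata/example.seqz.txt.gz) is an assumption that may differ between package versions, so the code checks that the file was found before reading it.

```r
library(sequenza)

# Route 1: load the bundled data set, convenient for examples and tests.
data(example.seqz)
str(example.seqz)

# Route 2: read a seqz text file, as in the standard workflow.
# Assumption: the package ships the file under extdata/; system.file()
# returns an empty string if it is not there.
seqz_path <- system.file("extdata", "example.seqz.txt.gz",
                         package = "sequenza")
if (nzchar(seqz_path)) {
  seqz <- read.seqz(seqz_path)
  print(head(seqz))
}
```

In a real analysis, seqz_path would instead point to a seqz file produced by sequenza-utils for the sample of interest.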

Source

The data set is derived from a TCGA specimen but has been scrambled to anonymize the source. The reference genome is hg19. The GC content was calculated in 50-base windows.


sequenza documentation built on May 9, 2019, 5:04 p.m.