SequenceTrack class and methods


Description

A track class to represent genomic sequences. The two child classes SequenceDNAStringSetTrack and SequenceBSgenomeTrack do most of the work; in practice, however, the distinction between them is of no particular relevance to the user.

Usage

SequenceTrack(sequence, chromosome, genome, name="SequenceTrack",
              importFunction, stream=FALSE, ...)

Arguments

sequence

A meta argument to handle the different input types, making the construction of a SequenceTrack as flexible as possible.

The different input options for sequence are:

An object of class DNAStringSet. The individual DNAStrings are considered to be the different chromosome sequences.

An object of class BSgenome. The Gviz package tries to follow the BSgenome philosophy in that the respective chromosome sequences are only realized once they are first accessed.

A character scalar: in this case the value of the sequence argument is interpreted as the path to a sequence file on disk. The Gviz package supports a range of file types, identified by the file extension. See the importFunction documentation below for further details.

chromosome

The currently active chromosome of the track. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that sequences for the respective chromosomes are indeed part of the object. If not provided here, the constructor will set it to the first available sequence. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome.

genome

The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not formally checked at this point. For a SequenceBSgenomeTrack object, the genome information is extracted from the input BSgenome package. For a DNAStringSet it has to be provided, or the constructor will fall back to the default value of NA.

name

Character scalar of the track's name used in the title panel when plotting.

importFunction

A user-defined function to be used to import the sequence data from a file. This only applies when the sequence argument is a character string with the path to the input data file. The function needs to accept an argument file containing the file path and has to return a proper DNAStringSet object with the sequence information per chromosome. A set of default import functions is already implemented in the package for a number of different file types, and one of these defaults will be picked automatically based on the extension of the input file name. If the extension cannot be mapped to any of the existing import functions, an error is raised asking for a user-defined import function. Currently the following file types can be imported with the default functions: fa/fasta and 2bit.

Both file types support indexing by genomic coordinates, and it makes sense to only load the part of the file that is needed for plotting. To this end, the Gviz package defines the derived ReferenceSequenceTrack class, which supports streaming data from the file system. The user typically does not have to deal with this distinction but may rely on the constructor function to make the right choice as long as the default import functions are used. However, once a user-defined import function has been provided and if this function adds support for indexed files, you will have to make the constructor aware of this fact by setting the stream argument to TRUE. Please note that in this case the import function needs to accept a second mandatory argument selection which is a GRanges object containing the dimensions of the plotted genomic range. As before, the function has to return an appropriate DNAStringSet object.

stream

A logical flag indicating that the user-provided import function can deal with indexed files and knows how to process the additional selection argument when accessing the data on disk. This causes the constructor to return a ReferenceSequenceTrack object which will grab the necessary data on the fly during each plotting operation.

...

Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details.
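
As a rough sketch of the importFunction contract described above, the following defines a streaming import function that accepts file and selection and returns a DNAStringSet. The function name myImport, the temporary FASTA file, and the use of the Rsamtools functions indexFa and scanFa are assumptions made for illustration; this is not necessarily how the default Gviz importers work.

```r
library(GenomicRanges)
library(Rsamtools)   # provides indexFa() and scanFa()

## Hypothetical FASTA file with two short "chromosomes"
fastaFile <- tempfile(fileext = ".fa")
writeLines(c(">chr1", strrep("ACGT", 25),
             ">chr2", strrep("TTGCA", 20)), fastaFile)
indexFa(fastaFile)   # writes the .fai index next to the file

## User-defined import function for indexed files: 'file' is the file
## path, 'selection' is a GRanges object with the region to load.
myImport <- function(file, selection) {
    seqs <- scanFa(file, param = selection)
    ## Name each returned sequence after its chromosome
    names(seqs) <- as.character(seqnames(selection))
    seqs
}

## Only the requested region is read from disk
sel <- GRanges("chr1", IRanges(start = 1, end = 8))
res <- myImport(fastaFile, sel)
res
```

A function of this shape would then be passed to the constructor as importFunction together with stream=TRUE, so that only the plotted region is read during each plotting operation.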

Value

The return value of the constructor function is a new object of class SequenceDNAStringSetTrack, SequenceBSgenomeTrack or ReferenceSequenceTrack, depending on the constructor arguments. Typically the user will not have to be troubled with this distinction and can rely on the constructor to make the right choice.
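
The following minimal sketch, assuming a small made-up DNAStringSet, shows which class the constructor returns for this input type and how the active chromosome is chosen when none is supplied. The sequences and the variable names seqs, sTrack and firstChr are illustrative only.

```r
library(Gviz)
library(Biostrings)

## Two tiny made-up chromosome sequences
seqs <- DNAStringSet(c(chr1 = "ACGTACGTAC", chr2 = "TTTTGGGGCCCCAAAA"))
sTrack <- SequenceTrack(seqs, genome = "hg19")

## The concrete class depends on the input type
class(sTrack)
is(sTrack, "SequenceTrack")

## No chromosome was supplied, so the first available sequence is active
firstChr <- chromosome(sTrack)
firstChr

## Switch to a different active chromosome
chromosome(sTrack) <- "chr2"
chromosome(sTrack)
```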

Objects from the class

Objects can be created using the constructor function SequenceTrack.

Details

Depending on the available space, the class will use different options to plot a sequence. If single letters can be accommodated without overplotting, those will be shown. Otherwise, colored boxes will be used to indicate letters, and if there is not enough horizontal room to show those, a simple line will indicate the presence of a sequence. The min.width and fontsize display parameters directly control this behaviour. Each of the five possible nucleotides (G, A, T, C, and N) will be encoded in a separate color. By default we use the colors suggested in the biovizBase package, but a user is free to set their own color scheme by providing a named character vector of colors as the display parameter fontcolor, with names equal to the five possible bases.
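
A custom color scheme as described above could look like the following sketch. The particular colors, the short made-up sequence and the name myColors are assumptions chosen for illustration.

```r
library(Gviz)
library(Biostrings)

## A single made-up chromosome of 100 bases
seqs <- DNAStringSet(c(chr1 = paste(rep("ACGTN", 20), collapse = "")))

## Named character vector: one color per possible base
myColors <- c(A = "darkgreen", C = "blue", G = "orange", T = "red", N = "gray")
sTrack <- SequenceTrack(seqs, genome = "hg19", fontcolor = myColors)

## Narrow window: there may be enough room for individual letters
plotTracks(sTrack, from = 1, to = 30)

## Force colored boxes regardless of the available space
plotTracks(sTrack, from = 1, to = 30, noLetters = TRUE)
```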

Slots

chromosome:

Object of class "character", the chromosome on which the track is defined. There can only be a single chromosome for one track. Throughout the package, chromosome names have to be entered either as a single integer scalar or as a character scalar of the form chrXYZ, where XYZ may be an arbitrary character string.

genome:

Object of class "character", the genome for which the track is defined. This should be a valid UCSC genome identifier, however this may not always be formally checked upon object instantiation.

dp:

Object of class DisplayPars, inherited from class GdObject.

name:

Object of class "character", inherited from class GdObject.

imageMap:

Object of class ImageMap, inherited from class GdObject.

Extends

Class "GdObject", directly.

Methods

In the following code chunks, obj is considered to be an object inheriting from class SequenceTrack.

Exported in the name space:

chromosome

signature(GdObject="SequenceTrack"): return the chromosome for which the track is defined.

Usage:

chromosome(GdObject)

Examples:

chromosome(obj)

chromosome<-

signature(GdObject="SequenceTrack"): replace the value of the track's chromosome. This has to be a valid UCSC chromosome identifier or an integer or character scalar that can be reasonably coerced into one.

Usage:

chromosome<-(GdObject, value)

Additional Arguments:

value: replacement value.

Examples:

chromosome(obj) <- "chr12"

genome

signature(x="SequenceTrack"): return the track's genome.

Usage:

genome(x)

Examples:

genome(obj)

genome<-

signature(x="SequenceTrack"): set the track's genome. Usually this has to be a valid UCSC identifier, however this is not formally enforced here.

Usage:

genome<-(x, value)

Additional Arguments:

value: replacement value.

Examples:

genome(obj) <- "mm9"

length

signature(x="SequenceTrack"): return the number of nucleotides in the track's sequence.

Usage:

length(x)

Examples:

length(obj)

seqnames

signature(x="SequenceTrack"): return the names (i.e., the chromosomes) of the sequences contained in the object.

Usage:

seqnames(x)

Examples:

seqnames(obj)

subseq

signature(x="SequenceTrack"): Extract a sub-sequence from the track.

Usage:

subseq(x, start=NA, end=NA, width=NA)

Additional Arguments:

start: the start coordinate for the sub-sequence.

end: the end coordinate for the sub-sequence.

width: the width of the sub-sequence.

Examples:

subseq(obj, 1, 10)

Internal methods:

initialize

signature(.Object="SequenceTrack"): initialize the object.

Inherited methods:

displayPars

signature(x="SequenceTrack", name="character"): list the value of the display parameter name. See settings for details on display parameters and customization.

Usage:

displayPars(x, name)

Examples:

displayPars(obj, "col")

displayPars

signature(x="SequenceTrack", name="missing"): list the value of all available display parameters. See settings for details on display parameters and customization.

Examples:

displayPars(obj)

getPar

signature(x="SequenceTrack", name="character"): alias for the displayPars method. See settings for details on display parameters and customization.

Usage:

getPar(x, name)

Examples:

getPar(obj, "col")

getPar

signature(x="SequenceTrack", name="missing"): alias for the displayPars method. See settings for details on display parameters and customization.

Examples:

getPar(obj)

displayPars<-

signature(x="SequenceTrack", value="list"): set display parameters using the values of the named list in value. See settings for details on display parameters and customization.

Usage:

displayPars<-(x, value)

Examples:

displayPars(obj) <- list(col="red", lwd=2)

setPar

signature(x="SequenceTrack", value="character"): set the single display parameter name to value. Note that display parameters in the SequenceTrack class are pass-by-reference, so no re-assignment to the symbol obj is necessary. See settings for details on display parameters and customization.

Usage:

setPar(x, name, value)

Additional Arguments:

name: the name of the display parameter to set.

Examples:

setPar(obj, "col", "red")

setPar

signature(x="SequenceTrack", value="list"): set display parameters by the values of the named list in value. Note that display parameters in the SequenceTrack class are pass-by-reference, so no re-assignment to the symbol obj is necessary. See settings for details on display parameters and customization.

Examples:

setPar(obj, list(col="red", lwd=2))

names

signature(x="SequenceTrack"): return the value of the name slot.

Usage:

names(x)

Examples:

names(obj)

names<-

signature(x="SequenceTrack", value="character"): set the value of the name slot.

Usage:

names<-(x, value)

Examples:

names(obj) <- "foo"

coords

signature(ImageMap="SequenceTrack"): return the coordinates from the internal image map.

Usage:

coords(ImageMap)

Examples:

coords(obj)

tags

signature(x="SequenceTrack"): return the tags from the internal image map.

Usage:

tags(x)

Examples:

tags(obj)

drawAxis

signature(GdObject="SequenceTrack"): add a y-axis to the title panel of a track if necessary. Unless overwritten in one of the sub-classes this usually does not plot anything and returns NULL.

Usage:

drawAxis(GdObject, ...)

Additional Arguments:

...: all further arguments are ignored.

Examples:

Gviz:::drawAxis(obj)

drawGrid

signature(GdObject="SequenceTrack"): superpose a grid on top of a track if necessary. Unless overwritten in one of the sub-classes this usually does not plot anything and returns NULL.

Usage:

drawGrid(GdObject, ...)

Additional Arguments:

...: additional arguments are ignored.

Examples:

Gviz:::drawGrid(obj)

Display Parameters

The following display parameters are set for objects of class SequenceTrack upon instantiation:

size=NULL: Numeric scalar. The size of the track item. Defaults to auto-detect the size based on the other parameter settings.

fontcolor=getBioColor("DNA_BASES_N"): Character vector. The colors used for the 5 possible nucleotides (G, A, T, C, N). Defaults to use colors as defined in the biovizBase package.

fontsize=10: Numeric scalar. Controls the size of the sequence and thus also the level of plottable detail.

fontface=2: Numeric scalar. The face of the font.

lwd=2: Numeric scalar. The width of the line when no individual letters can be plotted due to size limitations.

col="darkgray": Character scalar. The color of the line when no individual letters can be plotted due to size limitations.

min.width=2: Numeric scalar. The minimum width of the colored boxes that are drawn when no individual letters can be plotted due to size limitations.

showTitle=FALSE: Logical scalar. Do not show a title panel by default.

background.title="transparent": Character scalar. Make the title panel transparent by default.

col.border.title="transparent": Integer or character scalar. The border color for the title panels.

lwd.border.title=1: Integer scalar. The border width for the title panels.

noLetters=FALSE: Logical scalar. Always plot colored boxes (or a line) regardless of the available space.

add53=FALSE: Logical scalar. Add a direction indicator.

complement=FALSE: Logical scalar. Plot the sequence complement.

Additional display parameters are being inherited from the respective parent classes. Note that not all of them may have an effect on the plotting of SequenceTrack objects.

GdObject:

alpha=1: Numeric scalar. The transparency for all track items.

background.panel="transparent": Integer or character scalar. The background color of the content panel.

cex=1: Numeric scalar. The overall font expansion factor for all text.

cex.axis=NULL: Numeric scalar. The expansion factor for the axis annotation. Defaults to NULL, in which case it is computed based on the available space.

cex.title=NULL: Numeric scalar. The expansion factor for the title panel. This affects the font size of both the title and the axis, if any. Defaults to NULL, which means that the text size is automatically adjusted to the available space.

col.axis="white": Integer or character scalar. The font and line color for the y axis, if any.

col.frame="lightgray": Integer or character scalar. The line color used for the panel frame, if frame==TRUE.

col.grid="#808080": Integer or character scalar. Default line color for grid lines, both when type=="g" in DataTracks and when display parameter grid==TRUE.

col.line=NULL: Integer or character scalar. Default colors for plot lines. Usually the same as the global col parameter.

col.symbol=NULL: Integer or character scalar. Default colors for plot symbols. Usually the same as the global col parameter.

col.title="white": Integer or character scalar. The font color for the title panels.

collapse=TRUE: Boolean controlling whether to collapse the content of the track to accommodate the minimum current device resolution. See collapsing for details.

fill="lightgray": Integer or character scalar. Default fill color setting for all plotting elements, unless there is a more specific control defined elsewhere.

fontface.title=2: Integer or character scalar. The font face for the title panels.

fontfamily="sans": Integer or character scalar. The font family for all text.

fontfamily.title="sans": Integer or character scalar. The font family for the title panels.

frame=FALSE: Boolean. Draw a frame around the track when plotting.

grid=FALSE: Boolean, switching on/off the plotting of a grid.

h=-1: Integer scalar. Parameter controlling the number of horizontal grid lines, see panel.grid for details.

lineheight=1: Numeric scalar. The font line height for all text.

lty="solid": Integer or character scalar. Default line type setting for all plotting elements, unless there is a more specific control defined elsewhere.

lty.grid="solid": Integer or character scalar. Default line type for grid lines, both when type=="g" in DataTracks and when display parameter grid==TRUE.

lwd.grid=1: Numeric scalar. Default line width for grid lines, both when type=="g" in DataTracks and when display parameter grid==TRUE.

min.distance=1: Numeric scalar. The minimum pixel distance before collapsing range items, only if collapse==TRUE. See collapsing for details.

min.height=3: Numeric scalar. The minimum range height in pixels to display. All ranges are expanded to this size in order to avoid rendering issues. See collapsing for details.

showAxis=TRUE: Boolean controlling whether to plot a y axis (only applies to track types where axes are implemented).

v=-1: Integer scalar. Parameter controlling the number of vertical grid lines, see panel.grid for details.
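
As a brief sketch of how the display parameters listed above can be supplied, the following sets some at construction time, some through the displayPars replacement method, and one at plotting time. The short made-up sequence and the chosen values are assumptions for illustration.

```r
library(Gviz)
library(Biostrings)

## A single made-up chromosome of 100 bases
seqs <- DNAStringSet(c(chr1 = paste(rep("ACGTN", 20), collapse = "")))

## Display parameters passed through '...' at construction time
sTrack <- SequenceTrack(seqs, genome = "hg19", noLetters = TRUE)

## Display parameters set later from a named list
displayPars(sTrack) <- list(add53 = TRUE, complement = TRUE)
displayPars(sTrack, "add53")

## Display parameters supplied for a single plotting call
plotTracks(sTrack, from = 1, to = 20, fontsize = 12)
```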

Author(s)

Florian Hahne

See Also

AnnotationTrack

DataTrack

DisplayPars

GdObject

GeneRegionTrack

GRanges

ImageMap

IRanges

BSgenome

DNAStringSet

plotTracks

settings

Examples

## An empty object
SequenceTrack()

## Construct from DNAStringSet
library(Biostrings)
## Use 'bases' rather than 'letters' to avoid masking the built-in constant
bases <- c("A", "C", "T", "G", "N")
set.seed(999)
seqs <- DNAStringSet(c(chr1=paste(sample(bases, 100000, TRUE), collapse=""),
                       chr2=paste(sample(bases, 200000, TRUE), collapse="")))
sTrack <- SequenceTrack(seqs, genome="hg19")
sTrack

## Construct from BSgenome object
if(require(BSgenome.Hsapiens.UCSC.hg19)){
    sTrack <- SequenceTrack(Hsapiens)
    sTrack
}

## Set active chromosome
chromosome(sTrack)
chromosome(sTrack) <- "chr2"
head(seqnames(sTrack))

## Plotting
## Sequences
plotTracks(sTrack, from=199970, to=200000)
## Boxes
plotTracks(sTrack, from=199800, to=200000)
## Line
plotTracks(sTrack, from=1, to=200000)
## Force boxes
plotTracks(sTrack, from=199970, to=200000, noLetters=TRUE)
## Direction indicator
plotTracks(sTrack, from=199970, to=200000, add53=TRUE)
## Sequence complement
plotTracks(sTrack, from=199970, to=200000, add53=TRUE, complement=TRUE)
## Colors
plotTracks(sTrack, from=199970, to=200000, add53=TRUE,
           fontcolor=c(A=1, C=1, G=1, T=1, N=1))

## Track names
names(sTrack)
names(sTrack) <- "foo"

## Accessors
genome(sTrack)
genome(sTrack) <- "mm9"
length(sTrack)

## Sequence extraction
subseq(sTrack, start=100000, width=20)
## beyond the stored sequence range
subseq(sTrack, start=length(sTrack), width=20)
