View source: R/gene_map_plot.R
gene_map_plot | R Documentation |
Uses base position information to make a linear plot of genetic features, producing a gene map. Features are separated into "gene features" (plotted as large blocks on the main chromosome) and "extra features" (plotted as small bars offset from the main chromosome).
gene_map_plot(
  mapDT,
  genome_len = NULL,
  gene_colour = NULL,
  gene_type = c("gene", "rRNA"),
  extra_type = c("tRNA", "D-loop"),
  plot_xmin = 0,
  plot_xmax = NULL,
  plot_ymax = 5,
  extra_ypos = 3,
  gene_txt_size = 4,
  extra_txt_size = 4,
  font = "Arial",
  gene_border = NA,
  gene_size = 2
)
mapDT |
Data.table: genetic feature information. Requires the columns:
|
genome_len |
Integer: the genome length. Default = NULL. If unspecified,
will be assigned the final base pair of the last genetic feature in |
gene_colour |
Character: a vector of colours to plot genes. Each item
is a colour, with the gene accessible through |
gene_type |
Character: a vector of values present in |
extra_type |
Character: a vector of values present in |
plot_xmin |
Numeric: a single value, the minimum x-axis limit. Default is 0. |
plot_xmax |
Numeric: a single value, the maximum x-axis limit.
Default is the genome length, as per |
plot_ymax |
Numeric: a single value, the maximum y-axis limit. See Details for parameterisation. |
extra_ypos |
Numeric: a single value, the starting y-axis position for extra features. See Details for parameterisation. |
gene_txt_size |
Integer: a single value, the size for gene feature labels. Default is 4. |
extra_txt_size |
Integer: a single value, the size for extra feature labels. Default is 4. |
font |
Character: a single value, the font family to use.
Default is "Arial". |
gene_border |
Character: a single value, the colour for borders around gene features. Default is NA, no border. |
gene_size |
Numeric: a single value, the thickness of borders around
gene features, if a colour is specified in |
There are two major features plotted, "gene features" and
"extra features". These names are just for convention: gene features are
plotted as large coloured bars in the centre of the plot on the main "chromosome",
whereas extra features are plotted as small grey bars above/below the
gene features, offset from the main chromosome. Anything could be plotted as
a gene or extra feature, and these are specified through gene_type
and extra_type.
The name of the genetic feature being plotted is the value of mapDT$NAME.
This value is effectively evaluated as a mathematical expression, which allows
italics and mixed formatting in gene names. Internally, the values are
evaluated by geom_text(..., parse=TRUE) and geom_text_repel(..., parse=TRUE).
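The parsing behaviour described above can be previewed in base R before plotting. This is a minimal sketch that assumes parse=TRUE treats each label much like base R's parse(text=...); the label strings are only illustrations, and the exact handling inside gene_map_plot() may differ.

```r
# Assumption: labels are parsed in a way comparable to base::parse(text = ...).

# A bare label containing '-' is read as a subtraction, not as a name.
bare <- parse(text = "D-loop")[[1]]
is.call(bare)        # the label has become the call `D - loop`

# Wrapping the label in single quotes makes it a plain character string.
quoted <- parse(text = "'D-loop'")[[1]]
is.character(quoted)

# Plotmath functions such as italic() survive parsing as a call,
# which is what allows italic gene names.
ital <- parse(text = "italic(COX1)")[[1]]
as.character(ital[[1]])

# A label that is not valid R syntax cannot be parsed at all unless quoted.
bad <- tryCatch(parse(text = "12S ribosomal RNA"), error = function(e) e)
inherits(bad, "error")
```

Checking labels this way makes it easier to decide which values of NAME need quoting and which can be left as expressions.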
The value of mapDT$STRAND dictates the position of the coloured bars.
A value of 1 places "genes" on the top of the genomic strand, whereas a value
of -1 places "genes" below the genomic strand.
The colour of the gene features is specified through gene_colour as
a named vector. If there are two genes, 'COX1' and 'COX2', specification of
their colours can be done like so: c(COX1='pink', COX2='blue').
If colours are not specified, one colour is automatically assigned to each
unique "gene".
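When gene names have been wrapped in single quotes so that they are parsed as characters, the names of the colour vector need the same quotes in order to match. A minimal base R sketch of building such a vector follows; the object names genes, quoted_names and gene_cols are hypothetical and chosen only for illustration.

```r
# Hypothetical gene names and colours, following the COX1/COX2 case above.
genes <- c("COX1", "COX2")
cols  <- c("pink", "blue")

# Add single quotes so the names match a quoted NAME column.
quoted_names <- paste0("'", genes, "'")

# Build the named colour vector in one step.
gene_cols <- setNames(cols, quoted_names)

names(gene_cols)
gene_cols[["'COX1'"]]
```

Deriving the names with paste0() from the same vector used to edit NAME reduces the chance of a mismatch between the two.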
The value of extra_ypos specifies the distance of the extra features
from the gene features. Set it larger if things are looking squashed.
Additionally, plot_ymax sets the maximal plotting area, so set this
value larger if things are not fitting well.
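As a sketch of the adjustment described above, the call below moves the extra features further from the chromosome and enlarges the plotting area. It reuses the example GENBANK file shipped with genomalicious; the values 4 and 7 are arbitrary illustrations rather than recommended settings, and the object names mapDT and p are hypothetical.

```r
library(genomalicious)

# Read the example mitogenome shipped with the package.
genomaliciousExtData <- paste0(find.package('genomalicious'), '/extdata')
gbk.read <- mitoGbk2DT(paste(genomaliciousExtData, 'data_Bcocosensis.gbk', sep='/'))

# Drop the "CDS" rows and quote the names so they are parsed as characters.
mapDT <- gbk.read[TYPE != 'CDS']
mapDT[, NAME := paste0("'", NAME, "'")]

# Illustrative values only: defaults are extra_ypos = 3 and plot_ymax = 5.
p <- gene_map_plot(
  mapDT = mapDT, genome_len = 16692,
  extra_ypos = 4, plot_ymax = 7
)
p
```

Raising both values together keeps the extra features inside the plotting area while giving their labels more room.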
Returns a gg object.
library(genomalicious)
# Create a link to raw external datasets in genomalicious
genomaliciousExtData <- paste0(find.package('genomalicious'), '/extdata')
# Read in a GENBANK file of the Bathygobius cocosensis mitogenome
gbk.read <- mitoGbk2DT(paste(genomaliciousExtData, 'data_Bcocosensis.gbk', sep='/'))
head(gbk.read)
# Exclude the "CDS" types and plot genes, rRNA, tRNA, and D-loop.
# Rename rRNAs for nicer plotting. Because $NAME is evaluated by the
# expression() function, it is useful to put single quotations around characters
# to have them read as characters internally by gene_map_plot().
gbk.read[TYPE!='CDS'] %>%
  .[NAME=='12S ribosomal RNA', NAME:='12S'] %>%
  .[NAME=='16S ribosomal RNA', NAME:='16S'] %>%
  .[, NAME:=paste0("'", NAME, "'")] %>%
  gene_map_plot(mapDT=., genome_len=16692, extra_txt_size=3)
# Plot just the COX genes, CYTB, and the D-loop as "gene features" with
# custom colours and a border. Again, note the use of single quotes nested in
# double quotes, which will match up to the edited gene $NAME column below.
gene.col.vec <- c(
  "'COX1'"='royalblue',
  "'COX2'"='firebrick3',
  "'COX3'"='mediumpurple2',
  "'CYTB'"='plum3',
  "'D-loop'"='grey40')
# Subset focal genes, add quotes to ensure characters are parsed as characters.
gbk.read[NAME %in% c('COX1','COX2','COX3','CYTB','D-loop')] %>%
  .[, NAME:=paste0("'", NAME, "'")] %>%
  gene_map_plot(
    mapDT=., genome_len=16692,
    gene_type=c('gene', 'D-loop'), gene_colour=gene.col.vec,
    extra_type=NULL, gene_border='black')
# It is possible to parse characters without the added single quotes, but note
# how the '-' character in 'D-loop' has been parsed as a minus symbol.
gbk.read[NAME %in% c('COX1','COX2','COX3','CYTB','D-loop')] %>%
  gene_map_plot(
    mapDT=., genome_len=16692, gene_type=c('gene', 'D-loop'),
    extra_type=NULL, gene_border='black')