logomaker: Create logo plots from aligned sequences or positional...


View source: R/logomaker.R

Description

Takes as input either a vector of character sequences (aligned to have the same length) or a positional frequency or weight matrix, and plots the standard logo or the Enrichment Depletion (ED) Logo.

Usage

logomaker(data, type = c("Logo", "EDLogo"), bg = NULL, n_data = NULL,
  n_bg = NULL, tol = 0, shrink = TRUE, pseudocount = NULL,
  color_type = NULL, colors = NULL, color_seed = NULL,
  return_heights = FALSE, logo_control = list())

Arguments

data

The input data may be a vector of character sequences representing aligned DNA, RNA or amino acid sequences, or a matrix/data frame whose rows correspond to symbols (single characters or strings of characters) and whose columns correspond to the positions or sites of the aligned sequences.

type

Either "Logo" or "EDLogo", depending on whether the user wants to plot the standard Logo or the Enrichment Depletion Logo.

bg

The background probability. Defaults to NULL, in which case equal probability is assigned to each symbol. The user can instead specify a vector (equal in length to the number of symbols) giving the background probability of each symbol, which is assumed to be the same across the columns (sites), or a matrix in which each cell gives the background probability of a symbol at a position.

n_data

The number of sequences used to build the positional weight matrix (table).

n_bg

The number of sequences used for the background probabilities.

tol

The tolerance for the KL-divergence of the positional weight data and background probabilities.

shrink

A Boolean indicating whether to use the ash shrinkage on the positional weights or not.

pseudocount

A small pseudocount added mainly to bypass 0 entries. Default is NULL, in which case the value depends on the input: 0.5 if the table is a counts matrix, and 0.001 times the minimum non-zero value of the table if it is a positional weight matrix.

color_type

A list specifying the coloring scheme. Defaults to NULL, in which case a coloring scheme is chosen based on color_seed. The list contains two elements: type and col. The type can be one of "per-row", "per-column" and "per-symbol". The col element is a vector of colors: of the same length as the number of rows in the table for "per-row" (assigning a color to each string), of the same length as the number of columns in the table for "per-column" (assigning a color to each column), or a distinct color for each distinct symbol for "per-symbol". For "per-symbol", the length of color_profile$col should match the library size of the logos; if more or fewer colors are provided, they are downsampled or upsampled as required. The colors are matched with the symbols in total_chars.

colors

Add description here.

color_seed

A seed for choosing among the multiple coloring schemes available in color_profile. The default is 2030, but the user can supply any seed.

return_heights

Boolean. If TRUE, the function returns the stack heights for the logo plot. For the standard Logo (type = "Logo"), it returns the information content. For type = "EDLogo", it returns the total stack height along the positive and negative axes, as well as the breakdown of the heights by symbol along the two axes. Defaults to FALSE.

logo_control

Control parameters for the logo plot. Check the input arguments from the plogomaker and nlogomaker functions.
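
The bg and pseudocount descriptions above can be illustrated with a small base R sketch. It does not call the package; the names symbols, n_pos, bg_vec, bg_mat and default_pseudocount are hypothetical, and the test used to decide whether a table is a counts matrix is an assumption rather than the package's documented check.

```r
# Hypothetical 4-symbol, 7-position setting; all names here are illustrative.
symbols <- c("A", "C", "G", "T")
n_pos <- 7

# Vector form of bg: one background probability per symbol,
# shared by every position (site).
bg_vec <- c(A = 0.3, C = 0.2, G = 0.2, T = 0.3)

# Matrix form of bg: symbols along the rows, positions along the columns.
# Here every column simply repeats bg_vec.
bg_mat <- matrix(bg_vec, nrow = length(symbols), ncol = n_pos,
                 dimnames = list(symbols, seq_len(n_pos)))

# Default pseudocount rule as described for the pseudocount argument.
# Assumption: a table is treated as a counts matrix when every entry
# is a whole number; the package may decide this differently.
default_pseudocount <- function(tab) {
  if (all(tab == floor(tab))) 0.5 else 0.001 * min(tab[tab > 0])
}

counts <- matrix(c(4, 0, 0, 0, 1, 1, 1, 1), nrow = 4)  # toy counts table
pwm <- sweep(counts, 2, colSums(counts), "/")           # column-normalised weights

default_pseudocount(counts)
default_pseudocount(pwm)
```

Either bg_vec or bg_mat could then be passed as the bg argument, provided its rows line up with the symbols of the data.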

Value

Returns a standard Logo or EDLogo plot of the sequences or positional frequency matrix, depending on whether type is "Logo" or "EDLogo". If return_heights = TRUE, the stack heights are returned.
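
For orientation on the heights returned for type = "Logo", the sketch below computes the textbook information content of each position, log2(number of symbols) minus the Shannon entropy of the column. The function name info_content is hypothetical, and the heights the package actually returns may differ (for example through pseudocounts or a non-uniform background).

```r
# Textbook per-position information content (in bits) for a table with
# symbols along the rows and positions along the columns.
# Assumes a uniform background; info_content is an illustrative name only.
info_content <- function(counts) {
  p <- sweep(counts, 2, colSums(counts), "/")        # column-wise frequencies
  h <- -colSums(ifelse(p > 0, p * log2(p), 0))       # Shannon entropy, 0*log(0) := 0
  log2(nrow(counts)) - h
}

# Toy table: position 1 is fully conserved, position 2 is uniform.
toy <- matrix(c(4, 0, 0, 0,
                1, 1, 1, 1), nrow = 4)
ic <- info_content(toy)
ic
```

A fully conserved position reaches the maximum of log2(4) = 2 bits for a four-symbol alphabet, while a uniform position carries no information.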

Examples

library(Logolas)

sequence <- c("CTATTGT", "CTCTTAT", "CTATTAA", "CTATTTA", "CTATTAT",
              "CTTGAAT", "CTTAGAT", "CTATTAA", "CTATTTA", "CTATTAT",
              "CTTTTAT", "CTATAGT", "CTATTTT", "CTTATAT", "CTATATT",
              "CTCATTT", "CTTATTT", "CAATAGT", "CATTTGA", "CTCTTAT",
              "CTATTAT", "CTTTTAT", "CTATAAT", "CTTAGGT",
              "CTATTGT", "CTCATGT", "CTATAGT", "CTCGTTA",
              "CTAGAAT", "CAATGGT")

logomaker(sequence, type = "Logo")
logomaker(sequence, type = "EDLogo")

library(ggseqlogo)
data(ggseqlogo_sample)

sequence <- seqs_aa$AKT1
logomaker(sequence, type = "Logo")
logomaker(sequence, type = "EDLogo")

data("seqlogo_example")
logomaker(seqlogo_example, type = "Logo", return_heights = TRUE)
logomaker(seqlogo_example, type = "EDLogo", return_heights = TRUE)
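
As a complement to the examples above, the matrix form of the data argument (symbols along the rows, positions along the columns) can be built from aligned sequences in base R. This is a minimal sketch; the names seqs, chars, symbols and counts are illustrative, and the final call is left commented out because it assumes the Logolas package is attached.

```r
# A few aligned sequences of equal length (taken from the first example).
seqs <- c("CTATTGT", "CTCTTAT", "CTATTAA", "CTATTTA")

# Split into a sequences-by-positions character matrix.
chars <- do.call(rbind, strsplit(seqs, ""))

# Alphabet observed in the data.
symbols <- sort(unique(as.vector(chars)))

# Count each symbol at each position: symbols along rows, positions along columns.
counts <- sapply(seq_len(ncol(chars)), function(j) {
  table(factor(chars[, j], levels = symbols))
})
rownames(counts) <- symbols
colnames(counts) <- seq_len(ncol(chars))

counts

# With Logolas attached, the table could be passed in place of the sequences:
# logomaker(counts, type = "Logo")
```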

kkdey/Logolas documentation built on May 20, 2019, 10:30 a.m.