In lixiang117423/ggmotif: Extract and Visualize Motif Information from MEME Software

title: ggmotif
author: Xiang Li
date: '2022-06-30'
slug: ggmotif
categories:
  - Bioinformatics
tags:
  - R
subtitle: ''
summary: 'R package ggmotif'
authors: []
lastmod: '2022-06-30T22:00:06+02:00'
featured: no
image:
  caption: ''
  focal_point: ''
  preview_only: no
projects: []
links:
  - icon: github
    icon_pack: fab
    name: GitHub
    url: https://github.com/lixiang117423/ggmotif

ggmotif: An R Package for the extraction and visualization of motifs from MEME software

Release version: 0.1.2

MEME Suite is one of the most widely used tools for identifying motifs in deoxyribonucleic acid (DNA) or protein sequences. However, MEME Suite saves its results in .xml and .txt files that are difficult to read, visualize, or integrate with widely used phylogenetic tree packages such as ggtree. To overcome this problem, we developed the ggmotif R package, which provides a set of easy-to-use functions for extracting and visualizing motifs from the result files generated by MEME Suite. ggmotif extracts the location of each motif on its corresponding sequence from the .xml file and visualizes it. The extracted data can also be easily integrated with phylogenetic data generated by ggtree. In addition, ggmotif reads the sequence of each motif from the .txt file and draws the sequence logo with the ggseqlogo function from the ggseqlogo R package.

Authors

Xiang LI

College of Plant Protection, Yunnan Agricultural University

https://www.web4xiang.top/

Identification of motifs

The demo data, the AP2 gene family of Arabidopsis thaliana, was downloaded from the Plant Transcription Factor Database. MEME v5.4.1, the latest version at the time of writing, was used to search for motifs in the demo data with the following command:

meme ara.fa -protein -o meme_out -mod zoops -nmotifs 10 -minw 4 -maxw 7 -objfun classic -markov_order 0
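For readers who prefer to launch MEME from an R session, the same call can be assembled as an argument vector. This is only a minimal sketch, assuming the `meme` executable is installed and on the `PATH`. The names `meme_args` and `meme_cmd` are illustrative and not part of ggmotif, and the flag descriptions in the comments paraphrase the MEME command-line help.

```r
# Arguments copied from the command above; comments paraphrase the MEME help text.
meme_args <- c(
  "ara.fa",              # input FASTA file
  "-protein",            # treat the sequences as protein
  "-o", "meme_out",      # output directory
  "-mod", "zoops",       # zero or one occurrence of a motif per sequence
  "-nmotifs", "10",      # find up to 10 motifs
  "-minw", "4",          # minimum motif width
  "-maxw", "7",          # maximum motif width
  "-objfun", "classic",  # classic objective function
  "-markov_order", "0"   # order of the background Markov model
)

# Build the command string for inspection; nothing is executed here.
meme_cmd <- paste("meme", paste(meme_args, collapse = " "))
print(meme_cmd)

# Uncomment to run MEME (assumes `meme` is installed and on the PATH):
# system2("meme", args = meme_args)
```

Keeping the arguments in a vector makes it easy to change a single setting, such as the number of motifs, without retyping the whole command.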

The output files (an HTML file, a .txt file, and an .xml file) can be found on GitHub.

Construction of phylogenetic tree

clustalo (v1.2.4) was used to align the sequences, and FastTree (v2.1.10) was used to construct the phylogenetic tree.

clustalo -i ara.fa > ara.aligned.fa
FastTree ara.aligned.fa > ara.nwk

The output files can be found on GitHub.

Installation and loading

Install the latest development version from GitHub as follows:

if(!require(devtools)) install.packages("devtools")
devtools::install_github("lixiang117423/ggmotif")

Or install from CRAN as follows:

install.packages("ggmotif")

Load the package:

library(ggmotif)

Parse motif information from MEME results

MEME Suite writes many output files: a figure for each motif and three other files, an HTML file, a .txt file, and an .xml file. The HTML file contains figures of the motifs. The .txt file contains the sequence information, and the .xml file contains other information, including position, length, and p-value.

The main function of ggmotif is to parse the information and plot the position of each motif on the corresponding sequences.

Parse information

Sequences of motifs (from the .txt file)

filepath <- system.file("examples", "meme.txt", package = "ggmotif")
motif.info <- getMotifFromMEME(data = filepath, format="txt")

Other details of motifs (from the .xml file)

filepath <- system.file("examples", "meme.xml", package="ggmotif")
motif.info.2 <- getMotifFromMEME(data = filepath, format="xml")

Plot location

The figures produced by MEME show only motif locations, which makes them difficult to combine with the corresponding phylogenetic tree. In ggmotif, the motifLocation function visualizes the location of each motif on its corresponding sequence, much like the figure in the HTML file. If the user supplies the corresponding phylogenetic tree, the function combines the tree with the location plot.

Without tree

filepath <- system.file("examples", "meme.xml", package = "ggmotif")
motif_extract <- getMotifFromMEME(data = filepath, format="xml")
motif_plot <- motifLocation(data = motif_extract)
motif_plot +
  ggsci::scale_fill_aaas()

ggplot2::ggsave(filename = "1.png", width = 6, height = 6, dpi = 300)

With tree

filepath <- system.file("examples", "meme.xml", package = "ggmotif")
treepath <- system.file("examples", "ara.nwk", package="ggmotif")
motif_extract <- getMotifFromMEME(data = filepath, format="xml")
motif_plot <- motifLocation(data = motif_extract, tree = treepath)
motif_plot +
  ggsci::scale_fill_aaas()

ggplot2::ggsave(filename = "2.png", width = 8, height = 6, dpi = 300)
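Before combining a tree with motif locations, it can help to confirm that the tree and the MEME results refer to the same sequences. The sketch below is a hedged illustration in base R, assuming that tip labels are expected to correspond to the sequence names in the MEME output. The Newick string and the `motif_seqs` vector are made-up toy data, and the regular expression handles only simple unquoted labels; for real trees a dedicated parser such as `ape::read.tree` is the safer choice.

```r
# Toy Newick string in the style FastTree writes (support values follow ")").
nwk <- "((seqA:0.10,seqB:0.20)0.95:0.05,seqC:0.30);"

# Tip labels are the names that directly follow "(" or ",".
tip_labels <- regmatches(
  nwk,
  gregexpr("(?<=[(,])[^(),:;]+", nwk, perl = TRUE)
)[[1]]

# Hypothetical sequence names as they might appear in the parsed MEME table.
motif_seqs <- c("seqA", "seqB", "seqD")

# Names present in only one source would not line up in a combined plot.
missing_in_tree  <- setdiff(motif_seqs, tip_labels)
missing_in_motif <- setdiff(tip_labels, motif_seqs)
print(missing_in_tree)
print(missing_in_motif)
```

Two empty results would indicate that every sequence appears in both inputs.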

Show motif(s)

library(tidyverse)

filepath <- system.file("examples", "meme.txt", package = "ggmotif")
motif.info <- getMotifFromMEME(data = filepath, format = "txt")

# show one motif
motif.info %>%
  dplyr::select(2, 4) %>%
  dplyr::filter(motif.num == "Motif.2") %>%
  dplyr::select(2) %>%
  ggseqlogo::ggseqlogo() +
  theme_bw()
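A sequence logo is built from the letter counts at each aligned position of a motif. The following sketch shows that underlying tabulation in base R on made-up sequences. It is an illustration of the idea only, not the code used by ggseqlogo, and the names `sites`, `counts`, and `freqs` are hypothetical.

```r
# Toy aligned sequences standing in for the sites of one motif (made-up data).
sites <- c("ACGT", "ACGA", "TCGT", "ACGT")

# One row per site, one column per position.
chars <- do.call(rbind, strsplit(sites, ""))

# Letters observed anywhere in the motif, in alphabetical order.
alphabet <- sort(unique(as.vector(chars)))

# Count each letter at each position.
counts <- sapply(seq_len(ncol(chars)), function(j) {
  tabulate(match(chars[, j], alphabet), nbins = length(alphabet))
})
dimnames(counts) <- list(alphabet, paste0("pos", seq_len(ncol(chars))))

# Convert counts to per-position frequencies (each column sums to 1).
freqs <- counts / length(sites)
print(freqs)
```

Positions where one letter dominates, such as the second and third columns here, are the ones that appear as tall letters in a logo.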

filepath <- system.file("examples", "meme.txt", package = "ggmotif")
motif.info <- getMotifFromMEME(data = filepath, format = "txt")

# show all motifs
plot.list <- list()

for (i in unique(motif.info$motif.num)) {
  motif.info %>%
    dplyr::select(2, 4) %>%
    dplyr::filter(motif.num == i) %>%
    dplyr::select(2) %>%
    ggseqlogo::ggseqlogo() +
    labs(title = i) +
    theme_bw() -> plot.list[[i]]
}

cowplot::plot_grid(plotlist = plot.list, ncol = 2)
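The loop above handles one motif at a time. The same grouping can be sketched with `split()`, which is also a convenient way to check that all sequences within a motif share one width before drawing logos. The data frame below is toy data: `motif.num` follows the column used above, while the column name `seq` and the object names are assumptions made for illustration.

```r
# Toy table standing in for the parsed .txt result (made-up data).
motif_toy <- data.frame(
  motif.num = c("Motif.1", "Motif.1", "Motif.2", "Motif.2", "Motif.2"),
  seq = c("ACGT", "ACGA", "TTGACC", "TTGACA", "TAGACC"),  # hypothetical column name
  stringsAsFactors = FALSE
)

# Group the sequences by motif: a named list with one element per motif.
by_motif <- split(motif_toy$seq, motif_toy$motif.num)

# Summarise the number of sites and the width of each motif.
summary_tbl <- data.frame(
  motif   = names(by_motif),
  n_sites = lengths(by_motif),
  width   = vapply(by_motif, function(s) nchar(s[1]), integer(1)),
  row.names = NULL
)

# TRUE for a motif when all of its sequences have the same length.
same_width <- vapply(by_motif, function(s) length(unique(nchar(s))) == 1, logical(1))

print(summary_tbl)
print(same_width)
```

A named list like `by_motif` can then be passed to `lapply()` to build one plot per motif instead of filling a list inside a `for` loop.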

Compare with other tools

The widely used R packages memes and universalmotif can process .txt files generated by MEME, but the information they extract does not include the locations of motifs.

library(tidyverse)
library(memes)

meme.res = memes::importMeme("meme.txt",combined_sites = TRUE)

# table from memes::importMeme function
meme.res[["meme_data"]] %>% 
  dplyr::select_if(~ !any(is.na(.))) %>% 
  dplyr::select(-bkg,-motif)

meme.res[["combined_sites"]]

uni.res = universalmotif::read_meme("meme.txt")
uni.res[[1]]

In addition, the functions above cannot handle .xml files:

memes::importMeme("meme.xml")

Error in convert_motifs(motifs) : Input is an empty list

universalmotif::read_meme("meme.xml")

Session Info

sessionInfo()

Contributing

We welcome any contributions!

lixiang117423/ggmotif documentation built on Aug. 14, 2022, 5:32 a.m.
