We developed PEA-m5C, an accurate transcriptome-wide m5C modification predictor built within a machine learning framework using the random forest algorithm. PEA-m5C was trained with features extracted from the flanking sequences of m5C modifications. We also deposited all candidate m5C modification sites in the Ara-m5C database (http://bioinfo.nwafu.edu.cn/software/Ara-m5C.html) to support follow-up research on functional mechanisms. Finally, to make PEA-m5C widely usable, we implemented it as a cross-platform, user-friendly, interactive interface and as an R package named “PEA-m5C”, based on the R statistical language and the Java programming language, which may advance functional research on m5C.
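To give a feel for what "features from the flanking sequences" can look like, here is a minimal base R sketch of one common encoding, one-hot (binary) encoding of a nucleotide window. The function `one_hot_encode` is hypothetical and is not part of the package; this section does not state which features PEA-m5C actually uses, so treat this purely as an illustration.

```r
# Hypothetical encoder (an assumption, not PEA-m5C's FeatureExtract):
# one-hot encode a nucleotide window into a numeric vector with four
# entries (A, C, G, T) per position. Unknown bases become all zeros.
one_hot_encode <- function(window, alphabet = c("A", "C", "G", "T")) {
  bases <- strsplit(toupper(window), "", fixed = TRUE)[[1]]
  # Rows are positions, columns are alphabet letters.
  mat <- outer(bases, alphabet, FUN = "==") * 1
  # Flatten position by position: pos1_A, pos1_C, ..., posN_T
  as.vector(t(mat))
}

one_hot_encode("ACGT")
```

A vector like this, computed for every candidate site, could serve as one row of the feature matrix passed to a classifier such as a random forest.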
## Install rJava and RWeka
sudo apt-get update
sudo apt-get install r-cran-rjava r-cran-rweka
## Install R dependencies
dependency.packages <- c("randomForest", "seqinr", "stringr", "FSelector", "bigmemory", "ggplot2", "PRROC", "pROC")
install.packages(dependency.packages)
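If some of these packages are already present, it can be convenient to install only the missing ones. The following is a small sketch of that idea; `missing_packages` is a hypothetical helper introduced here for illustration and is not provided by PEAm5C.

```r
# Hypothetical helper (not part of PEAm5C): return the packages in `pkgs`
# that are not yet installed in the current R library paths.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

dependency.packages <- c("randomForest", "seqinr", "stringr", "FSelector",
                         "bigmemory", "ggplot2", "PRROC", "pROC")

to_install <- missing_packages(dependency.packages)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All dependencies are already installed.")
}
```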
## Install PEAm5C
install.packages("Download path/PEAm5C_0.11.tar.gz", repos = NULL, type = "source")
The basic data set can be found in the package's data directory. More details are available in the user manual.
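library(seqinr)  # provides c2s(), which collapses a character vector into one string
## Read sequences from the example cDNA FASTA file shipped with the package
seq <- extra_motif_seq(input_seq_dir = paste0(system.file(package = "PEAm5c"), "/data/cdna.fa"), up = 5)
seq <- lapply(seq, c2s)
## Encode the sequences as features and predict m5C sites
seq_feature <- FeatureExtract(seq)
res <- predict_m5c(seq_feature)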
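As a rough illustration of what extracting flanking sequences around candidate cytosines can involve, here is a self-contained base R sketch. It assumes, for illustration only, a window of 5 nucleotides on each side of every cytosine. The function `extract_c_windows` is hypothetical; it is not the implementation of `extra_motif_seq`, and the exact meaning of that function's `up` argument should be checked in the user manual.

```r
# Hypothetical helper (not PEAm5C's extra_motif_seq): for every cytosine in
# `sequence`, return the window of `up` nucleotides on each side of it.
# Cytosines too close to either end for a full window are skipped.
extract_c_windows <- function(sequence, up = 5) {
  sequence <- toupper(sequence)
  n <- nchar(sequence)
  c_pos <- gregexpr("C", sequence, fixed = TRUE)[[1]]
  if (c_pos[1] == -1) return(character(0))  # no cytosine found
  c_pos <- as.integer(c_pos)
  # Keep only positions with `up` nucleotides available on both sides.
  c_pos <- c_pos[c_pos > up & c_pos <= n - up]
  windows <- substring(sequence, c_pos - up, c_pos + up)
  names(windows) <- c_pos  # name each window by the position of its cytosine
  windows
}

toy <- "AGGTACGTTAGCATGGATCA"
extract_c_windows(toy, up = 5)
```

Each returned window has length `2 * up + 1` with the candidate cytosine at its center, which is a convenient fixed-length input for feature encoding.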
load(paste0(system.file(package = "PEAm5c"),"/data/samples.Rds"))
### Positive and negative sequences can be read with extra_motif_seq and encoded as features with FeatureExtract
seq <- PEA_ml(pos_sample = pos_sample,neg_sample = neg_sample)
model <- extra_model(res = seq)
model
res <- predict_self_model(models = model,sequence_dir = paste0(system.file(package = "PEAm5c"),"/data/cdna.fa"))
table(res[,4])
Song, J., Zhai, J., Bian, E., Song, Y., Yu, J., & Ma, C. (2018). Transcriptome-Wide Annotation of m5C RNA Modifications Using Machine Learning. Frontiers in Plant Science, 9, 519.
Please use PEAm5C/issues for questions about using PEAm5C and for reporting bugs.