FeatureExtract: Feature encoding

View source: R/seq.R

Description

This function implements three feature encoding schemes: binary, k-mer and PseDNC. For the binary encoding scheme, a vector of 404 (4*101) features is generated by encoding 'A', 'C', 'G', 'U' and 'N' as (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1) and (0,0,0,0), respectively. Here 'N' is a gap character used to keep the number of features fixed for every sample when an m6A/non-m6A site occurs near the start or end of the transcript. For the k-mer encoding, the composition of short subsequences of different lengths is used to encode each sample. For the PseDNC (pseudo dinucleotide composition) encoding, the local and global sequence-order information along the RNA sequence is used to score each sample.
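The binary and k-mer schemes described above can be illustrated with a minimal sketch. The helpers `binary_encode()` and `kmer_freq()` are hypothetical names introduced here for illustration; they are not part of the package and may differ from what FeatureExtract does internally (for example in k-mer ordering, the values of k used, or normalisation).

```r
# Hypothetical helper (not part of the package): one-hot encode an RNA string.
# 'N' (gap) maps to four zeros, so every position contributes 4 features.
binary_encode <- function(rna) {
  codes <- list(A = c(1, 0, 0, 0), C = c(0, 1, 0, 0),
                G = c(0, 0, 1, 0), U = c(0, 0, 0, 1),
                N = c(0, 0, 0, 0))
  bases <- strsplit(toupper(rna), "")[[1]]
  unlist(codes[bases], use.names = FALSE)
}

# Hypothetical helper (not part of the package): relative frequency of each
# of the 4^k possible k-mers, counted over all overlapping windows.
kmer_freq <- function(rna, k = 2) {
  stopifnot(nchar(rna) >= k)
  alphabet <- c("A", "C", "G", "U")
  # Enumerate all 4^k k-mers in a fixed order
  grid <- expand.grid(rep(list(alphabet), k), stringsAsFactors = FALSE)
  kmers <- do.call(paste0, grid)
  # Slide a window of width k along the sequence
  starts <- seq_len(nchar(rna) - k + 1)
  observed <- substring(toupper(rna), starts, starts + k - 1)
  # Windows containing 'N' match no k-mer and are not counted,
  # but they still contribute to the denominator
  counts <- table(factor(observed, levels = kmers))
  as.numeric(counts) / length(observed)
}

site <- "NNACGUACGUA"
length(binary_encode(site))   # 4 features per position
length(kmer_freq(site, k = 2)) # one feature per possible dinucleotide
```

Under this sketch a 101-nt window yields 4 * 101 = 404 binary features, matching the count given in the description.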

Usage

  FeatureExtract(RNAseq, lambda = 6, w = 0.9)

Arguments

RNAseq

A list containing the FASTA format sequences.

lambda

The lambda parameter for the PseDNC-related features, default is 6.

w

The weighting parameter for PseDNC-related features, default is 0.9.
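For orientation, the roles of lambda and w can be read off the commonly used PseDNC definition, sketched below. That FeatureExtract follows exactly this formulation is an assumption, and the symbols L (sequence length), f_u (normalised frequency of the u-th dinucleotide), mu (number of physicochemical properties) and P_g (the g-th property value of a dinucleotide) are notation introduced here, not taken from the package.

```latex
% PseDNC feature vector: 16 dinucleotide terms plus \lambda correlation terms
\mathbf{D} = [d_1, \dots, d_{16}, d_{16+1}, \dots, d_{16+\lambda}]^{T}

d_u =
\begin{cases}
\dfrac{f_u}{\sum_{i=1}^{16} f_i + w \sum_{j=1}^{\lambda} \theta_j}, & 1 \le u \le 16 \\[2ex]
\dfrac{w\,\theta_{u-16}}{\sum_{i=1}^{16} f_i + w \sum_{j=1}^{\lambda} \theta_j}, & 17 \le u \le 16+\lambda
\end{cases}

% j-th tier correlation between dinucleotides that are j positions apart
\theta_j = \frac{1}{L-j-1} \sum_{i=1}^{L-j-1}
  \Theta\left(R_i R_{i+1},\, R_{i+j} R_{i+j+1}\right)

\Theta\left(R_i R_{i+1},\, R_{i+j} R_{i+j+1}\right) =
  \frac{1}{\mu} \sum_{g=1}^{\mu}
  \left[ P_g(R_i R_{i+1}) - P_g(R_{i+j} R_{i+j+1}) \right]^2
```

Under this formulation, lambda sets how many correlation tiers (global sequence-order terms) are appended to the 16 dinucleotide frequencies, so the default lambda = 6 would give 22 PseDNC features, and w balances those correlation terms against the local composition terms.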

Value

A matrix with features.

Author(s)

Jie Song, Jingjing Zhai, Chuang Ma

Examples

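# Extract motif sequences from the cDNA FASTA file in the package's data directory
aaa <- extra_motif_seq(input_seq_dir = paste0(system.file(package = "PEAm5c"),"/data/cdna.fa"),up = 5)
# Collapse each character vector into a single string (c2s() comes from seqinr)
aaa <- lapply(aaa, c2s)
# Encode the sequences with the default lambda = 6 and w = 0.9
bbb <- FeatureExtract(aaa)
# Show the first 10 rows of the resulting feature matrix
bbb[1:10,]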

cma2015/PEA-m5C documentation built on May 17, 2019, 8:05 a.m.