formatSeq: Format RNA/Protein Sequences According to the Index

View source: R/Modelling.R

formatSeq    R Documentation

Description

This function generates a list of sequences according to the specified indices. The sequence list can be used as input for feature extraction or prediction.

Usage

formatSeq(idx, seqs)

Arguments

idx

the indices (or names) specifying which sequences to select.

seqs

sequences loaded by the function read.fasta from the seqinr package, or a list of sequences.

Details

This function selects sequences from a sequence list using the specified indices (or names). For example, when the names of RNA-protein interaction pairs are provided but the sequences are listed in arbitrary order in one file, this function generates a list containing the sequences whose names are listed in the idx object. See the examples below.
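The name-based selection described above can be pictured with a small base-R sketch. This is not the package's implementation: it assumes that selection amounts to looking up each name of idx in names(seqs), and it uses the base function match() as a stand-in. The sequence names and contents are hypothetical toy data.

```r
# Toy sequence list stored in arbitrary order (hypothetical names and contents).
seqs <- list(
  seqC = c("a", "u", "g"),
  seqA = c("g", "g", "c"),
  seqB = c("u", "a", "a")
)

# Names of the sequences wanted, in the order wanted.
idx <- c("seqA", "seqB", "seqC")

# Assumed behaviour, not ncProR's own code: look up each name in the
# list's names and subset the list by the resulting positions.
selected <- seqs[match(idx, names(seqs))]

# The selected list follows the order of idx.
names(selected)
```

With this approach a name in idx that is absent from names(seqs) yields an NA position, so checking anyNA(match(idx, names(seqs))) beforehand is a reasonable precaution.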

Value

This function returns a list containing the sequences specified by idx.

Examples

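# Load the demo index and the demo positive sequences.
data(demoIDX)
data(demoPositiveSeq)

# Build sequence lists according to the RNA and protein indices.
new_RNA <- formatSeq(demoIDX$RNA_index, demoPositiveSeq$RNA.positive)
new_Pro <- formatSeq(demoIDX$Pro_index, demoPositiveSeq$Pro.positive)

# Names of the original sequence lists.
names(demoPositiveSeq$Pro.positive)
names(demoPositiveSeq$RNA.positive)

# Names of the formatted sequence lists.
names(new_RNA)
names(new_Pro)

# new_RNA and new_Pro can be further used to extract features.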



HAN-Siyu/ncProR documentation built on Nov. 3, 2023, 12:08 a.m.