RNAfold2df: Function to parse the output from RNAfold 2.4.18 into data...

View source: R/RNAfold2df.R

RNAfold2df    R Documentation

Description

Parses the output of RNAfold 2.4.18 into a data frame and writes it to a tab-delimited file. Parsing follows the output description at https://www.tbi.univie.ac.at/RNA/RNAfold.1.html#heading5, which is based on manual entry of a single sequence:

"Here, the first line just repeats the sequence input. The second line contains a MFE structure in dot bracket notation followed by the minimum free energy. After this, the pairing probabilities for each nucleotide are shown in a pseudo dot-bracket notation followed by the free energy of ensemble. The next two lines show the centroid structure with its free energy and its distance to the ensemble as well as the MEA structure, its free energy and the maximum expected accuracy, respectively. The last line finally contains the frequency of the MFE representative in the complete ensemble of secondary structures and the ensemble diversity."

The output for multiple sequences contains the same information in a slightly different format and is parsed accordingly. The function writes a tab-delimited file named with the basename of the input file followed by 'df.txt'.
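
To make the quoted layout concrete, here is a minimal sketch of how the first two lines of a single-sequence record (the sequence, then the MFE structure followed by its energy in parentheses) could be turned into a one-row data frame and written as a tab-delimited file. It is not the package's implementation: the helper name parse_mfe_record, the column names, the output file name, and the example input (including its energy value) are all assumptions made up for illustration, and the remaining lines of the record (pairing probabilities, centroid, MEA, ensemble statistics) are not handled.

```r
# Hypothetical helper, NOT part of the package: parses only the first two
# lines of a single-sequence RNAfold record (sequence line + MFE line).
parse_mfe_record <- function(lines) {
  stopifnot(length(lines) >= 2)
  seq_line <- trimws(lines[1])
  mfe_line <- trimws(lines[2])

  # The structure is the leading run of dot-bracket characters.
  structure <- sub("^([().]+).*$", "\\1", mfe_line)

  # The energy is the number inside the trailing parentheses, e.g. "( -1.20)".
  energy <- as.numeric(
    sub("^.*\\([[:space:]]*(-?[0-9.]+)[[:space:]]*\\)[[:space:]]*$", "\\1", mfe_line)
  )

  data.frame(
    sequence = seq_line,
    mfe_structure = structure,
    mfe = energy,
    stringsAsFactors = FALSE
  )
}

# Illustrative input only: the energy value is made up, not computed by RNAfold.
example_lines <- c("GGGGAAAACCCC", "((((....)))) ( -1.20)")
df <- parse_mfe_record(example_lines)

# Assumed naming: basename "example" followed by "df.txt", written to a temp dir.
out_file <- file.path(tempdir(), "exampledf.txt")
write.table(df, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The energy pattern anchors on the closing parenthesis at the end of the line, so parentheses that belong to the structure itself are not mistaken for the energy field.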

Usage

RNAfold2df(file, writedir = getwd())

chris-hsiung/bears01 documentation built on April 9, 2024, 2:01 a.m.