transcriptInfo: Manage information about transcript reference


Description

Manage information about the transcript reference. These functions read, save and update the transcript information DataFrame.

Usage

tri.load(trInfoFile)
tri.save(trInfo, trInfoFile)
tri.hasGeneNames(trInfo)
tri.setGeneNames(trInfo, geneNames, transcriptNames=NULL)

Arguments

trInfoFile

Name of the file containing the transcript information, or of the file where the information should be stored.

trInfo

DataFrame containing the transcript information.

geneNames

Vector with new gene names that should be assigned to transcripts.

transcriptNames

Names of transcripts that should be associated with the gene names.

Details

If this information is not provided, BitSeq extracts information about the transcript reference from the alignment and sequence files. The information is stored in a so-called transcript information (trInfo) file, usually with the extension .tr. The file contains columns with gene names (if available), transcript names, transcript lengths and, optionally, adjusted transcript lengths. Transcript expression is reported in the same order as the transcripts appear in the trInfo file, so the file serves to identify the final results.

Another important use of the trInfo file is in calculating gene expression or within-gene expression, where the file determines which transcripts belong to which genes. For this, however, the gene names have to be properly set in the transcript information, which is not always the case.
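To illustrate how a gene-to-transcript mapping is used, here is a minimal sketch in base R. It is not the BitSeq implementation of getGeneExpression: the table is a plain data.frame standing in for trInfo, the column names (gene, transcript, length) are assumptions chosen for the example, and the expression values are made up.

```r
# Hypothetical stand-in for a trInfo table; the column names are
# assumptions for this sketch, not the names BitSeq uses.
trInfo <- data.frame(
  gene       = c("geneA", "geneA", "geneB"),
  transcript = c("tr1", "tr2", "tr3"),
  length     = c(1200L, 800L, 1500L),
  stringsAsFactors = FALSE
)

# Made-up transcript expression values, in the same row order as trInfo.
trExpr <- c(0.5, 0.25, 0.25)

# Gene expression: sum the expression of all transcripts of each gene.
geneExpr <- tapply(trExpr, trInfo$gene, sum)

# Within-gene expression: each transcript's share of its gene's total.
withinGene <- trExpr / as.vector(geneExpr[trInfo$gene])

print(geneExpr)
print(withinGene)
```

Both steps depend only on the gene column, which is why incorrect or missing gene names make the grouping meaningless.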

Function tri.load loads transcript information from a file provided by argument trInfoFile into a DataFrame.

Function tri.save saves transcript information from the DataFrame provided by argument trInfo into the file named by argument trInfoFile.

Function tri.hasGeneNames determines whether gene names are properly set in the transcript information. It returns TRUE if they are; otherwise it returns FALSE and issues a warning message identifying the problem.

Function tri.setGeneNames changes the gene names in the transcript information trInfo and returns a new DataFrame with the updated values. The vector geneNames should provide the gene names of the transcripts, and its length must equal the number of transcripts. The gene names must either be in the same order as their transcripts in the trInfo object or, if they are ordered differently, be accompanied by a vector of transcript names in the same order as the gene names, passed as argument transcriptNames. The names in transcriptNames have to correspond to the transcript names in the trInfo object.
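The role of transcriptNames can be sketched in base R with match(). This only illustrates the alignment described above and is not the code tri.setGeneNames uses; all names and values are made up for the example.

```r
# Transcript names in the order they appear in the (hypothetical) trInfo.
trNames <- c("tr1", "tr2", "tr3")

# Gene names supplied in a different order, together with the transcript
# each one belongs to (the role played by the transcriptNames argument).
geneNames       <- c("geneB", "geneA", "geneA")
transcriptNames <- c("tr3", "tr1", "tr2")

# For each trInfo transcript, find its position in transcriptNames.
idx <- match(trNames, transcriptNames)

# Every transcript in trInfo must have a corresponding entry.
stopifnot(!anyNA(idx))

# Gene names rearranged to follow the trInfo transcript order.
reordered <- geneNames[idx]
print(reordered)
```

If transcriptNames is omitted, geneNames is taken to be in trInfo order already, which corresponds to skipping the match() step.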

Value

Function tri.load returns a DataFrame with transcript information.

Function tri.hasGeneNames returns a logical value (TRUE or FALSE).

Function tri.setGeneNames returns a DataFrame with transcript information containing the updated gene names. Note that the transcript names do not change.

Author(s)

Peter Glaus

See Also

getExpression, getGeneExpression, tri.file.setGeneNames

Examples

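library(BitSeq)
## work in the directory with the example data shipped with BitSeq
setwd(system.file("extdata",package="BitSeq"))
## load the transcript information and show the first 10 transcripts
trinfo <- tri.load("ensSelect1.tr")
trinfo[1:10,]
## this should be true
tri.hasGeneNames(trinfo)
## reverse the gene order - this will make the information INCORRECT
rev.trinfo <- tri.setGeneNames(trinfo, rev(trinfo[,1]))
rev.trinfo[1:10,]
## save the modified transcript information under a new file name
tri.save(rev.trinfo, "reversed-ensSelect1.tr")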

BitSeq documentation built on Nov. 8, 2020, 5:25 p.m.