readInterestResults: Read interest/interest.sequential results text files

View source: R/readInterestResults.R

readInterestResults R Documentation

Read interest/interest.sequential results text files

Description

Reads one or more result text files generated by the interest or interest.sequential functions and builds a SummarizedExperiment object from them.

Usage

readInterestResults(resultFiles, sampleNames, 
	sampleAnnotation, commonColumns, freqCol, scaledRetentionCol,
	scaleLength, scaleFragment, reScale=FALSE, geneIdCol, 
	repeatsTableToFilter=c())

Arguments

resultFiles

Character vector with the paths to the tab-separated result files produced by the interest function.

sampleNames

Character vector with the names of the samples. It must have the same length as resultFiles.

sampleAnnotation

Data frame with one row per sample, i.e. as many rows as the length of resultFiles and sampleNames. The column names are the annotation names, and the values in each column are the annotations of the samples.
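
The size requirements for resultFiles, sampleNames and sampleAnnotation can be checked before calling readInterestResults. The following is a minimal sketch in base R; checkInputs is a hypothetical helper that is not part of IntEREst, and the file paths are placeholders that need not exist for the check.

```r
# Hypothetical inputs for illustration only.
resultFiles <- file.path("results",
	c("df1.tsv", "df2.tsv", "df3.tsv", "df4.tsv"))
sampleNames <- c("sam1", "sam2", "sam3", "sam4")
sampleAnnotation <- data.frame(
	gender=c("M", "M", "F", "F"),
	health=c("healthy", "unhealthy", "healthy", "unhealthy"))

# checkInputs() is a hypothetical helper, not an IntEREst function.
# It stops with an error if the three inputs disagree in size.
checkInputs <- function(files, names, annotation){
	stopifnot(
		is.character(files),
		is.character(names),
		is.data.frame(annotation),
		length(files) == length(names),
		nrow(annotation) == length(files))
	invisible(TRUE)
}

inputsOk <- checkInputs(resultFiles, sampleNames, sampleAnnotation)
```

Running such a check first turns a size mismatch into an immediate, readable error.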

commonColumns

Columns in the result files that contain the intron/exon annotations and are common to all files defined in resultFiles.

freqCol

Column in the result files that contains the read counts for the introns/exons.

scaledRetentionCol

Column in the result files that contains the scaled retention values for the introns/exons.

scaleLength

Logical value indicating whether the intron/exon retention levels are scaled to the length of the introns/exons. If reScale is TRUE, the scaled retention levels are recalculated when reading the data.

scaleFragment

Logical value indicating whether the intron/exon retention levels are scaled to the fragments mapped to the genes. If reScale is TRUE, the scaled retention levels are recalculated when reading the data.

reScale

Logical value indicating whether the scaled retention levels are recalculated when reading the data. By default (FALSE) they are not recalculated, and the user is trusted to set the scaleLength and scaleFragment parameters correctly, i.e. as they were set in the interest() or interest.sequential() analysis.

geneIdCol

The number or name of the column in resultFiles that holds the gene/transcript names. It is used to sum the number of fragments mapped to each gene when scaling the retention levels. It is only used if the reScale and scaleFragment arguments are both TRUE.
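
To illustrate what scaling to the fragments mapped to the genes and to the feature length means, here is a small sketch in base R. All values are made up, featLen is a hypothetical length column, and constant factors are omitted, so this should be read as an assumed form of the scaling and not necessarily the exact formula used by IntEREst.

```r
# Toy data: two genes with three features each (made-up values).
geneId <- c("gene_1", "gene_1", "gene_1", "gene_2", "gene_2", "gene_2")
readCnt <- c(10, 30, 60, 5, 5, 10)
featLen <- c(100, 200, 300, 100, 100, 200)

# Fragment scaling: divide each count by the summed count of its gene.
geneTotal <- tapply(readCnt, geneId, sum)
fragScaled <- readCnt/as.numeric(geneTotal[geneId])

# Length scaling (assumed form): also divide by the feature length.
bothScaled <- fragScaled/featLen
```

In this sketch the per-gene sums play the role that the geneIdCol column plays when reScale and scaleFragment are both TRUE.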

repeatsTableToFilter

A data.frame with a structure similar to the reference. It includes chr, begin, and end columns. If defined, all reads mapped to the described regions are ignored and the intron/exon lengths are corrected to exclude the regions with repetitive DNA sequences. See getRepeatTable. It is only used if the reScale and scaleLength arguments are both TRUE.
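
As a rough illustration of the length correction, the sketch below builds a table with chr, begin, and end columns and subtracts the repeat-covered bases from the length of one intron. The coordinates are made up, overlapLen is a hypothetical helper rather than an IntEREst function, and the sketch assumes 1-based inclusive coordinates and repeat regions that do not overlap one another.

```r
# Hypothetical repeats table with the chr, begin and end columns.
repeatsTable <- data.frame(
	chr=c("chr1", "chr1", "chr2"),
	begin=c(1001, 5001, 201),
	end=c(1300, 5200, 450),
	stringsAsFactors=FALSE)

# overlapLen() is a hypothetical helper: number of bases of one
# feature that are covered by repeats on the same chromosome.
overlapLen <- function(fChr, fBegin, fEnd, repeats){
	sameChr <- repeats$chr == fChr
	ov <- pmin(fEnd, repeats$end) - pmax(fBegin, repeats$begin) + 1
	sum(pmax(ov[sameChr], 0))
}

# One made-up intron on chr1 spanning positions 1000 to 2000.
intronLen <- 2000 - 1000 + 1
correctedLen <- intronLen - overlapLen("chr1", 1000, 2000, repeatsTable)
```

Here only the first repeat region falls inside the intron, so only its bases are removed from the length.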

Value

An object of class SummarizedExperiment.

Author(s)

Ali Oghabian

See Also

interest, InterestResult.

Examples



geneId<- paste("gene", c(rep(1,7), rep(2,7), rep(3,7), rep(4,7)), 
	sep="_")
readCnt1<- sample(1:100, 28)
readCnt2<- sample(1:100, 28)
readCnt3<- sample(1:100, 28)
readCnt4<- sample(1:100, 28)
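# Scale the read counts to the total count of each gene
# (stands in for the scaled retention values)
fpkm1<- readCnt1/(tapply(readCnt1, geneId, sum))[geneId]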
fpkm2<- readCnt2/(tapply(readCnt2, geneId, sum))[geneId]
fpkm3<- readCnt3/(tapply(readCnt3, geneId, sum))[geneId]
fpkm4<- readCnt4/(tapply(readCnt4, geneId, sum))[geneId]

# Create temporary directory
tmpDir=file.path(tempdir(),"InterestResult")
dir.create(tmpDir)

# Build text files similar to files resulted by interest
dfTmp=data.frame( 
		int_ex=rep(c(rep(c("exon","intron"),3),"exon"),4),
		int_ex_num= rep(c(1,1,2,2,3,3,4),4),         
		int_type=rep(c(NA,"U2",NA,"U12",NA,"U2",NA),4),
		strand=rep("*",28),
		gene_id= geneId,
		sam1_readCnt=readCnt1,
		sam2_readCnt=readCnt2,
		sam3_readCnt=readCnt3,
		sam4_readCnt=readCnt4,
		sam1_fpkm=fpkm1,
		sam2_fpkm=fpkm2,
		sam3_fpkm=fpkm3,
		sam4_fpkm=fpkm4
)

writeDf<-function(df, file){
	write.table(df, file, col.names=TRUE, 
		row.names=FALSE, quote=FALSE, sep='\t')
}

writeDf(dfTmp[, c(1:5,6,10)], paste(tmpDir, "df1.tsv", sep="/"))
writeDf(dfTmp[, c(1:5,7,11)], paste(tmpDir, "df2.tsv", sep="/"))
writeDf(dfTmp[, c(1:5,8,12)], paste(tmpDir, "df3.tsv", sep="/"))
writeDf(dfTmp[, c(1:5,9,13)], paste(tmpDir, "df4.tsv", sep="/"))

# Build object from generated text file results
testObj<-readInterestResults(
	resultFiles=paste(tmpDir, 
		c("df1.tsv", "df2.tsv", "df3.tsv", "df4.tsv"), sep="/"), 
	sampleNames=c("sam1","sam2","sam3","sam4"), 
	sampleAnnotation= data.frame( gender=c("M","M","F","F"),
		health=c("healthy","unhealthy","healthy","unhealthy")), 
	commonColumns=1:5, freqCol=6, scaledRetentionCol=7, 
	scaleLength=FALSE, scaleFragment=TRUE, reScale=FALSE)

# View object
testObj



gacatag/IntEREst documentation built on July 29, 2024, 1:12 a.m.