IntEREst: Intron-Exon Retention Estimator

readInterestResults

R Documentation

Read interest/interest.sequential results text files

Description

Reads one or more result text files generated by the interest or interest.sequential functions and builds a SummarizedExperiment-class object.

Usage

readInterestResults(resultFiles, sampleNames, 
	sampleAnnotation, commonColumns, freqCol, scaledRetentionCol,
	scaleLength, scaleFragment, reScale=FALSE, geneIdCol, 
	repeatsTableToFilter=c())

Arguments

`resultFiles`	Vector of character strings giving the paths to the tab-separated files produced by the `interest` function.
`sampleNames`	Vector of character strings giving the names of the samples. It should be the same length as the `resultFiles` parameter.
`sampleAnnotation`	Data frame with as many rows as the length of the `resultFiles` and `sampleNames` parameters. The column names represent the annotation names and the values in each column represent the annotations of the samples.
`commonColumns`	Columns in the result file which include intron/exon annotations and are common across all files defined in `resultFiles`.
`freqCol`	Column in the result file which includes the read counts for introns/exons.
`scaledRetentionCol`	Column in the result file which includes the scaled retention values for introns/exons.
`scaleLength`	Logical value, indicating whether the intron/exon retention levels are scaled to the length of the introns/exons. If `reScale` is `TRUE` the scaled retention levels are recalculated when reading the data.
`scaleFragment`	Logical value, indicating whether the intron/exon retention levels are scaled to the fragments mapped to the genes. If `reScale` is `TRUE` the scaled retention levels are recalculated when reading the data.
`reScale`	Logical value, indicating whether the scaled retention levels are recalculated when reading the data. By default they are not recalculated, and the user is trusted to set the `scaleLength` and `scaleFragment` parameters correctly, i.e. as they were set in the `interest()` or `interest.sequential()` analysis.
`geneIdCol`	The number or name of the column in `resultFiles` which represents the gene/transcript names. It is used for summing up the number of fragments mapped to the genes when scaling the retention levels. It is only used if the `reScale` and `scaleFragment` arguments are set to `TRUE`.
`repeatsTableToFilter`	A data.frame with a similar structure to the `reference`. It includes `chr`, `begin`, and `end` columns. If defined, all reads mapped to the described regions are ignored and the intron/exon lengths are corrected to exclude the regions with repetitive DNA sequences. See `getRepeatTable`. It is only used if the `reScale` and `scaleLength` arguments are set to `TRUE`.
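
To make the idea behind `scaleFragment` and `geneIdCol` concrete, here is a minimal base R sketch of scaling read counts to the fragments mapped to each gene. It mirrors the per-gene normalization used to build the toy values in the Examples section; it is an illustration only and is not necessarily the exact formula IntEREst applies when `reScale=TRUE`. The data and variable names are hypothetical.

```r
# Hypothetical toy data: two genes with three features (introns/exons) each.
geneId  <- c("gene_1", "gene_1", "gene_1", "gene_2", "gene_2", "gene_2")
readCnt <- c(10, 30, 60, 5, 5, 10)

# Total fragments per gene, repeated for every feature of that gene.
geneTotal <- ave(readCnt, geneId, FUN = sum)

# Fragment-scaled value: each feature's share of its gene's fragments.
# Assumption: this is a simplified stand-in for the package's own scaling.
scaled <- readCnt / geneTotal

print(data.frame(geneId, readCnt, geneTotal, scaled))
```

In this sketch the scaled values within each gene sum to 1, which is why a gene identifier column is needed whenever fragment-based scaling is recalculated.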

Value

An object of class SummarizedExperiment-class.

Author(s)

Ali Oghabian

Examples



geneId<- paste("gene", c(rep(1,7), rep(2,7), rep(3,7), rep(4,7)), 
	sep="_")
readCnt1<- sample(1:100, 28)
readCnt2<- sample(1:100, 28)
readCnt3<- sample(1:100, 28)
readCnt4<- sample(1:100, 28)
fpkm1<- readCnt1/(tapply(readCnt1, geneId, sum))[geneId]
fpkm2<- readCnt2/(tapply(readCnt2, geneId, sum))[geneId]
fpkm3<- readCnt3/(tapply(readCnt3, geneId, sum))[geneId]
fpkm4<- readCnt4/(tapply(readCnt4, geneId, sum))[geneId]

# Create temporary directory
tmpDir=file.path(tempdir(),"InterestResult")
dir.create(tmpDir)

# Build text files similar to the files produced by interest
dfTmp=data.frame( 
		int_ex=rep(c(rep(c("exon","intron"),3),"exon"),4),
		int_ex_num= rep(c(1,1,2,2,3,3,4),4),         
		int_type=rep(c(NA,"U2",NA,"U12",NA,"U2",NA),4),
		strand=rep("*",28),
		gene_id= geneId,
		sam1_readCnt=readCnt1,
		sam2_readCnt=readCnt2,
		sam3_readCnt=readCnt3,
		sam4_readCnt=readCnt4,
		sam1_fpkm=fpkm1,
		sam2_fpkm=fpkm2,
		sam3_fpkm=fpkm3,
		sam4_fpkm=fpkm4
)

writeDf<-function(df, file){
	write.table(df, file, col.names=TRUE, 
		row.names=FALSE, quote=FALSE, sep='\t')
}

writeDf(dfTmp[, c(1:5,6,10)], paste(tmpDir, "df1.tsv", sep="/"))
writeDf(dfTmp[, c(1:5,7,11)], paste(tmpDir, "df2.tsv", sep="/"))
writeDf(dfTmp[, c(1:5,8,12)], paste(tmpDir, "df3.tsv", sep="/"))
writeDf(dfTmp[, c(1:5,9,13)], paste(tmpDir, "df4.tsv", sep="/"))

# Build object from generated text file results
testObj<-readInterestResults(
	resultFiles=paste(tmpDir, 
		c("df1.tsv", "df2.tsv", "df3.tsv", "df4.tsv"), sep="/"), 
	sampleNames=c("sam1","sam2","sam3","sam4"), 
	sampleAnnotation= data.frame( gender=c("M","M","F","F"),
		health=c("healthy","unhealthy","healthy","unhealthy")), 
	commonColumns=1:5, freqCol=6, scaledRetentionCol=7, 
	scaleLength=FALSE, scaleFragment=TRUE, reScale=FALSE)

# View object
testObj
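
The example above passes `commonColumns`, `freqCol` and `scaledRetentionCol` as column positions. When the layout of a result file is not known in advance, the positions can be looked up from the file header before calling `readInterestResults`. The following is a small base R sketch under that assumption; the file contents and column names (`sam1_readCnt`, `sam1_fpkm`) are hypothetical and simply mirror the toy files built above.

```r
# Hypothetical result table with the same column layout as the example files.
res <- data.frame(
	int_ex = c("exon", "intron"),
	int_ex_num = c(1, 1),
	int_type = c(NA, "U2"),
	strand = "*",
	gene_id = "gene_1",
	sam1_readCnt = c(12, 3),
	sam1_fpkm = c(0.8, 0.2)
)
resFile <- tempfile(fileext = ".tsv")
write.table(res, resFile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read only the first row to obtain the header, then find positions by name.
hdr <- colnames(read.delim(resFile, nrows = 1, check.names = FALSE))
freqCol <- match("sam1_readCnt", hdr)
scaledRetentionCol <- match("sam1_fpkm", hdr)

# Treat every remaining column as an annotation column shared across files.
commonColumns <- setdiff(seq_along(hdr), c(freqCol, scaledRetentionCol))

print(list(commonColumns = commonColumns, freqCol = freqCol,
	scaledRetentionCol = scaledRetentionCol))
```

Looking positions up by name guards against silently reading the wrong column if the files were written with a different column order.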

gacatag/IntEREst documentation built on June 12, 2025, 6:21 p.m.