readPACds: Read a PACdataset
In BMILAB/movAPA: movAPA: Modeling and Visualization of Dynamics of Alternative PolyAdenylation

Description

readPACds reads PAC counts and sample annotation into a PACdataset.

Usage

readPACds(pacFile, colDataFile = NULL, noIntergenic = TRUE, PAname = "PA")

Arguments

`pacFile`	a file name or a data frame/matrix/dgCMatrix. If it is a file, it must have a header with at least the columns (chr, strand, coord). It may have other columns, including gff cols (gene/gene_type/ftr/ftr_start/ftr_end/UPA_start/UPA_end) and user-defined sample columns. If there is at least one non-numeric column other than the gff cols above, then all remaining columns are treated as annotation columns. If all remaining columns are numeric, then they are all treated as sample columns. Use annotatePAC() first if you need genome annotation of the coordinates.
`colDataFile`	a file name or a data frame. If it is a file, it is a sample annotation file with a header; row names are samples (all of which must be present in pacFile) and column names are sample groups. One or more columns may be used to define the groups of samples. When colDataFile=NULL, readPACds automatically retrieves the sample columns and gff columns (if any) from pacFile. If there are no sample columns, colData is set to a data frame with one column (group) and one row (tag), whose single element is group1. If pacFile or colDataFile is a character string, it is treated as a file name and readPACds reads the data from that file.
`noIntergenic`	TRUE/FALSE. If TRUE, PACs in intergenic regions (ftr='^inter') are removed.
`PAname`	specifies how the name (rowname) of each PAC is set. If PAname='PA', the name is set as 'gene:PAN'; if PAname='coord', it is set as 'gene:coord'.
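
The column rules above can be sketched in base R. This is only an illustration of the documented behaviour, not the code readPACds runs internally; the toy data frame, its gene IDs and counts, and the variable names (rest, sampleCols, kept, paNames) are assumptions made for this example. The 'gene:PAN' naming is left out because the numbering scheme is not described here.

```r
# Toy input; gene IDs and counts are made up for illustration.
pac <- data.frame(
  chr     = c("Chr1", "Chr1", "Chr2"),
  strand  = c("+", "-", "+"),
  coord   = c(1000L, 2500L, 300L),
  gene    = c("G1", "igt1", "G2"),
  ftr     = c("3UTR", "intergenic", "CDS"),
  anther1 = c(10, 0, 3),
  embryo1 = c(4, 7, 0),
  stringsAsFactors = FALSE
)

required <- c("chr", "strand", "coord")
gffCols  <- c("gene", "gene_type", "ftr", "ftr_start", "ftr_end",
              "UPA_start", "UPA_end")
stopifnot(all(required %in% colnames(pac)))

# Columns that are neither required nor gff columns.
rest <- setdiff(colnames(pac), c(required, gffCols))
allNumeric <- all(vapply(pac[rest], is.numeric, logical(1)))
# All numeric: sample columns. Any non-numeric: all are annotation columns.
sampleCols <- if (allNumeric) rest else character(0)

# Row names of a colData data frame must be samples present in pac.
colData <- data.frame(group = c("group1", "group2"),
                      row.names = c("anther1", "embryo1"),
                      stringsAsFactors = FALSE)
stopifnot(all(rownames(colData) %in% sampleCols))

# noIntergenic = TRUE: drop rows whose ftr matches '^inter'.
kept <- pac[!grepl("^inter", pac$ftr), , drop = FALSE]

# PAname = 'coord': names of the form 'gene:coord'.
paNames <- paste(kept$gene, kept$coord, sep = ":")
```

Running a check like this on your own table before calling readPACds can show which columns are likely to be treated as samples.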

Value

A PACdataset object whose @anno slot is a data frame with at least the three columns chr/strand/coord. If the input has no sample column, one sample named tag is added and assigned to group1.
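
As a rough picture of the fallback described above, the default sample annotation has the following shape in base R. The variable name defaultColData is hypothetical, and this is a sketch of the documented layout rather than the package's own construction.

```r
# One row named 'tag', one column named 'group', single element 'group1'.
defaultColData <- data.frame(group = "group1",
                             row.names = "tag",
                             stringsAsFactors = FALSE)
print(defaultColData)
```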

Examples

library(movAPA)
data(PACds)
## read simple PACfile that only has columns chr/strand/coord
pacFile=PACds@anno[,c('chr','strand','coord')]
colDataFile=NULL
p=readPACds(pacFile, colDataFile)
## read PACfile that has columns chr/strand/coord and sample columns
pacFile=PACds@anno[,c('chr','strand','coord')]
pacFile=cbind(pacFile, PACds@counts[,c('anther1','embryo1','anther2')])
colDataFile=NULL
p=readPACds(pacFile, colDataFile)
p@colData; head(p@counts)
## read PACfile that has columns chr/strand/coord, sample columns, and gff cols
## like gene/gene_type/ftr/ftr_start/ftr_end/UPA_start/UPA_end
pacFile=PACds@anno
pacFile=cbind(pacFile, PACds@counts[,c('anther1','embryo1','anther2')])
colDataFile=NULL
p=readPACds(pacFile, colDataFile)
p@colData; head(p@counts); head(p@anno)
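## read PACfile together with a colData data frame that defines sample groups
pacFile=PACds@anno
smps=c('anther1','embryo1','anther2')
pacFile=cbind(pacFile, PACds@counts[,smps])
## base as.data.frame() is used here instead of the package-internal .asDf()
colDataFile=as.data.frame(matrix(c('group1','group2','group1'),
             ncol=1, dimnames=list(smps, 'group')))
p=readPACds(pacFile, colDataFile)
p@colData; head(p@counts); head(p@anno)
## write both tables to files, then read the PACdataset from the file names
write.table(pacFile, file='pacFile', row.names=FALSE)
write.table(colDataFile, file='colDataFile', row.names=TRUE)
p=readPACds(pacFile='pacFile',
            colDataFile='colDataFile', noIntergenic=TRUE, PAname='PA')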
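
Before passing file names to readPACds, it can help to confirm that the written files look as expected. The sketch below uses only base R with temporary files and toy data (all values and variable names are made up). It does not show how readPACds parses files internally, only that files written with write.table as above carry a header and, for the sample annotation, row names.

```r
# Toy tables; values are illustrative only.
pac <- data.frame(chr = c("Chr1", "Chr2"), strand = c("+", "-"),
                  coord = c(1000L, 300L), anther1 = c(5, 2),
                  stringsAsFactors = FALSE)
colData <- data.frame(group = "group1", row.names = "anther1",
                      stringsAsFactors = FALSE)

pacPath <- tempfile(fileext = ".txt")
cdPath  <- tempfile(fileext = ".txt")
write.table(pac, file = pacPath, row.names = FALSE)
write.table(colData, file = cdPath, row.names = TRUE)

# Read back and check the required columns and the sample names.
pacBack <- read.table(pacPath, header = TRUE, stringsAsFactors = FALSE)
cdBack  <- read.table(cdPath, header = TRUE, stringsAsFactors = FALSE)
stopifnot(all(c("chr", "strand", "coord") %in% colnames(pacBack)))
stopifnot(all(rownames(cdBack) %in% colnames(pacBack)))

unlink(c(pacPath, cdPath))
```

Using tempfile() also avoids leaving 'pacFile' and 'colDataFile' behind in the working directory.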

BMILAB/movAPA documentation built on Jan. 3, 2024, 11:09 p.m.