processSeq: Process Sequencing Data for Poisson-based MRFs
In XMRF: Markov Random Fields for High-Throughput Genetics Data


Process and normalize RNA-Sequencing count data into a distribution appropriate for Poisson MRFs.

processSeq(X, quanNorm = 0.75, nLowCount = 20, percentLowCount = 0.95, NumGenes = 500, PercentGenes = 0.1)

`X`	n x p data matrix of counts (n samples in rows, p genes in columns).
`quanNorm`	optional quantile used for sample normalization; defaults to 0.75.
`nLowCount`	read-count threshold below which a count is treated as low when deciding whether to filter a gene; defaults to 20.
`percentLowCount`	filter out a gene if at least this fraction of samples have counts below `nLowCount`; defaults to 0.95.
`NumGenes`	number of genes to retain in the final data set; defaults to 500.
`PercentGenes`	fraction of genes to retain (0.1 means 10%); defaults to 0.1.
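As a rough illustration of how the filtering arguments might be read, the base-R sketch below applies a low-count rule and a variance-based selection to a simulated count matrix. It does not call XMRF: the simulated data, the helper names `filter_low_counts` and `keep_top_variance`, and the exact comparison used for the threshold are assumptions made for illustration, not the package's actual implementation.

```r
set.seed(1)

# Hypothetical data: 30 samples (rows) x 200 genes (columns) of simulated counts.
n <- 30; p <- 200
X <- matrix(rnbinom(n * p, mu = 50, size = 2), nrow = n, ncol = p)
# Make the first 10 genes almost always low so the filter has something to drop.
X[, 1:10] <- matrix(rpois(n * 10, lambda = 2), nrow = n)

# Hypothetical helper: drop genes whose fraction of samples with counts
# below nLowCount is at least percentLowCount.
filter_low_counts <- function(X, nLowCount = 20, percentLowCount = 0.95) {
  prop_low <- colMeans(X < nLowCount)
  X[, prop_low < percentLowCount, drop = FALSE]
}

# Hypothetical helper: keep the k genes with the largest variance.
keep_top_variance <- function(X, k) {
  k <- min(k, ncol(X))
  v <- apply(X, 2, var)
  X[, order(v, decreasing = TRUE)[seq_len(k)], drop = FALSE]
}

Xf <- filter_low_counts(X)          # low-count genes removed
Xv <- keep_top_variance(Xf, k = 50) # 50 most variable remaining genes
dim(Xf)
dim(Xv)
```

In this sketch the number of samples is unchanged by both helpers; only columns (genes) are dropped, which matches the n x NumGenes shape described for the return value.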

To bring next-generation sequencing count data into an appropriate distribution (with dispersion removed), the function takes the following steps:

1. Quantile normalization for the samples.
2. Filter out genes with all low counts.
3. Filter genes by maximal variance (if specified).
4. Transform the data to be closer to the Poisson distribution. A log or power transform is considered and selected based upon the Kolmogorov-Smirnov goodness of fit test.
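The normalization and transform steps above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the code used by `processSeq`: the helpers `quantile_normalize`, `ks_to_poisson`, and `choose_power` are hypothetical, only a power transform is searched (the log option is omitted), the grid of exponents is arbitrary, and averaging the per-gene Kolmogorov-Smirnov statistic is one plausible selection rule among several.

```r
set.seed(2)

# Hypothetical overdispersed counts: 40 samples (rows) x 25 genes (columns).
n <- 40; p <- 25
X <- matrix(rnbinom(n * p, mu = 200, size = 3), nrow = n, ncol = p)

# Scale each sample (row) so that its quanNorm quantile equals the
# average of that quantile across samples.
quantile_normalize <- function(X, quanNorm = 0.75) {
  q <- apply(X, 1, quantile, probs = quanNorm)
  sweep(X, 1, q / mean(q), "/")
}

# Kolmogorov-Smirnov distance between one gene's values and a Poisson
# distribution with matching mean. Warnings about ties are expected for
# discrete data and are suppressed here.
ks_to_poisson <- function(y) {
  unname(suppressWarnings(ks.test(y, "ppois", lambda = mean(y))$statistic))
}

# Assumed selection rule: pick the exponent whose transformed data has the
# smallest average KS distance to the Poisson across genes.
choose_power <- function(X, alphas = seq(0.1, 1, by = 0.1)) {
  score <- sapply(alphas, function(a) {
    Y <- floor(X^a)
    mean(apply(Y, 2, ks_to_poisson))
  })
  alphas[which.min(score)]
}

Xn <- quantile_normalize(X)
alpha <- choose_power(Xn)
Y <- floor(Xn^alpha)  # transformed, integer-valued data
alpha
```

Flooring after the power transform keeps the values integer-valued so that a comparison against a Poisson distribution is meaningful; whether the package rounds in the same way is not stated in this page.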

A processed data matrix with n rows and one column per retained gene, where the number of retained genes is determined by `NumGenes` or `PercentGenes`.