View source: R/scAPAtrap_funlib.R
reducePeaks | R Documentation |
Reduce peaks in a countsfile or counts table, and remove the same peaks from peaksfile (if provided), by filtering on min.cells/max.cells and min.count/max.count. This function is useful for retrieving highly expressed, lowly expressed, or moderately expressed peaks.
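The filtering described above can be sketched in base R on a toy triplet table. This is only an illustration and not the package's implementation: the column names `peak`, `cell`, and `count`, the helper name `filter_peaks_sketch`, and the reading of "read count" as the total across cells are all assumptions.

```r
# Toy triplet table; the column names are hypothetical, not taken from scAPAtrap.
counts <- data.frame(
  peak  = c("p1", "p1", "p1", "p2", "p2", "p3"),
  cell  = c("c1", "c2", "c3", "c1", "c2", "c1"),
  count = c(5, 4, 6, 1, 1, 20),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep peaks seen in >= min.cells cells
# and with a total count >= min.count (assumed meaning of the thresholds).
filter_peaks_sketch <- function(counts, min.cells = 10, min.count = 10) {
  # Number of distinct cells and summed count for each peak
  n_cells <- tapply(counts$cell, counts$peak, function(x) length(unique(x)))
  total   <- tapply(counts$count, counts$peak, sum)
  keep    <- names(n_cells)[n_cells >= min.cells & total >= min.count]
  counts[counts$peak %in% keep, , drop = FALSE]
}

reduced <- filter_peaks_sketch(counts, min.cells = 2, min.count = 10)
unique(reduced$peak)   # only "p1" passes both thresholds
```

In this toy table, `p2` fails the count threshold and `p3` fails the cell threshold, so only the three rows of `p1` remain.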
reducePeaks(
countsfile,
peaksfile = NULL,
min.cells = 10,
min.count = 10,
max.cells = NULL,
max.count = NULL,
suffix = ".reduced",
toSparse = FALSE,
...
)
countsfile |
The decompressed file path of counts.tsv.gz generated by |
peaksfile |
Path to a peaks file, or a peak table with five columns. If not NULL, peaksfile is filtered after countsfile has been filtered. |
min.cells |
retain peaks expressed in >= min.cells cells. The default value is 10. |
min.count |
retain peaks with read count >= min.count. The default value is 10. |
max.cells |
retain peaks expressed in < max.cells cells. The default value is NULL (unlimited). Use this to retain only lowly expressed peaks. |
max.count |
retain peaks with read count < max.count. The default value is NULL (unlimited). Use this to retain only lowly expressed peaks. |
suffix |
Suffix appended to the output file names; applicable when countsfile and peaksfile are both provided. With the default suffix, the filtered counts and peaks are written to <countsfile>.reduced and <peaksfile>.reduced. |
toSparse |
Whether to output a sparseMatrix (gene-cell) (TRUE) or keep the triplet table format of the input (FALSE). |
... |
Arguments passed to other methods and/or advanced arguments. Advanced arguments:
|
A data.frame (toSparse=FALSE), or a sparse Matrix (toSparse=TRUE) of counts, or a filename list with (countsfile, peaksfile) (if peaksfile is not NULL).
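To illustrate the difference between the two return shapes, the sketch below converts a toy triplet table into a sparse matrix with `Matrix::sparseMatrix`. It does not reproduce what `reducePeaks` does internally; the column names `peak`, `cell`, and `count` are assumptions made for the example.

```r
library(Matrix)

# Toy triplet table; column names are hypothetical.
counts <- data.frame(
  peak  = c("p1", "p1", "p2", "p3"),
  cell  = c("c1", "c2", "c1", "c3"),
  count = c(5, 4, 1, 20),
  stringsAsFactors = FALSE
)

# Map peak and cell labels to integer row and column indices.
peaks <- factor(counts$peak)
cells <- factor(counts$cell)

m <- sparseMatrix(
  i = as.integer(peaks),
  j = as.integer(cells),
  x = counts$count,
  dims = c(nlevels(peaks), nlevels(cells)),
  dimnames = list(levels(peaks), levels(cells))
)

dim(m)          # 3 peaks by 3 cells
m["p1", "c2"]   # 4
```

Peak and cell combinations absent from the triplet table are stored implicitly as zeros, which is why the sparse form is compact for single-cell counts.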
## Not run:
countsfile='../dataFly/APA.tails.no/counts.tsv.gz'
peaksfile='../dataFly/APA.tails.no/peaks-notails.saf'
## only filter countsfile or counts-table, return a df
reducePeaks(countsfile, min.cells = 10, min.count = 10, toSparse=FALSE)
## retain large peaks, and output both counts and peaks, save to .reduced file (>=10 & >=50)
reducePeaks(countsfile=countsfile, peaksfile=peaksfile, min.cells = 10, min.count = 50)
## retain low-expressed peaks, and output both counts and peaks, save to .reduced file (<=9 & <=49)
reducePeaks(countsfile=countsfile, peaksfile=peaksfile, max.cells = 9, max.count = 49,
            min.cells=NULL, min.count=NULL, suffix='.small')
## <=9
reducePeaks(countsfile=countsfile, peaksfile=peaksfile, max.cells = 9, max.count=NULL,
            min.cells=NULL, min.count=NULL, suffix='.smallcells')
## <=49
reducePeaks(countsfile=countsfile, peaksfile=peaksfile, max.cells = NULL, max.count=49,
            min.cells=NULL, min.count=NULL, suffix='.smallcounts')
smallpeaks=.loadPeaks('../dataFly/APA.tails.no/peaks-notails.saf.small')
largepeaks=.loadPeaks('../dataFly/APA.tails.no/peaks-notails.saf.reduced')
smallpeaks1=.loadPeaks('../dataFly/APA.tails.no/peaks-notails.saf.smallcells')
smallpeaks2=.loadPeaks('../dataFly/APA.tails.no/peaks-notails.saf.smallcounts')
fullpeaks=.loadPeaks(peaksfile)
nrow(smallpeaks); nrow(largepeaks); nrow(fullpeaks)
smallset2=unique(rbind(smallpeaks1, smallpeaks2))
nrow(smallset2) + nrow(largepeaks); nrow(fullpeaks)
## should be the same; if not, some peakIDs in fullpeaks may be missing from the counts table
## End(Not run)
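The consistency check at the end of the examples relies on a set identity: for integer values, the complement of ">= 10 cells and >= 50 counts" is "<= 9 cells or <= 49 counts". The sketch below demonstrates this on a made-up per-peak summary, assuming the thresholds behave as the example comments state. It does not call `reducePeaks`, and the columns `n_cells` and `total` are hypothetical.

```r
# Toy per-peak summary; values and column names are hypothetical.
stats <- data.frame(
  peak    = c("p1", "p2", "p3", "p4"),
  n_cells = c(12, 12, 3, 3),
  total   = c(80, 20, 80, 20),
  stringsAsFactors = FALSE
)

large       <- stats$peak[stats$n_cells >= 10 & stats$total >= 50]
smallcells  <- stats$peak[stats$n_cells <= 9]
smallcounts <- stats$peak[stats$total <= 49]

# A peak can fail both thresholds (p4 here), so take the union before counting.
smallset <- union(smallcells, smallcounts)

length(smallset) + length(large) == nrow(stats)   # TRUE
length(intersect(smallset, large))                # 0
```

This is why the examples merge the two small sets with `unique(rbind(...))` rather than adding their row counts directly.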