FiltDeepSignal | R Documentation |
Filter out data from contigs or modifications that do not meet the selection criteria. Can also be used to obtain a gposDeepSignalMod object by keeping only the target sites that have a fraction above 0.
FiltDeepSignal(
  gposDeepSignalModBase = NULL,
  gposDeepSignalMod = NULL,
  cContigToBeRemoved = NULL,
  dnastringsetGenome,
  nContigMinSize = -1,
  listPctSeqByContig,
  nContigMinPctOfSeq = -1,
  listMeanCovByContig,
  nContigMinCoverage = -1,
  cParamNameForFilter = NULL,
  nFiltParamLoBoundaries = NULL,
  nFiltParamUpBoundaries = NULL,
  cFiltParamBoundariesToInclude = NULL,
  listMeanParamByContig = NULL,
  nContigFiltParamLoBound = NULL,
  nContigFiltParamUpBound = NULL,
  nModMinCoverage = NULL
)
gposDeepSignalModBase |
An UnStitched GPos object containing DeepSignal modification target sites data to be filtered. Defaults to NULL. |
gposDeepSignalMod |
An UnStitched GPos object containing DeepSignal modified sites data to be filtered. Defaults to NULL. |
cContigToBeRemoved |
Names of contigs for which the data will be removed. gposDeepSignalModBase or gposDeepSignalMod must be provided if using this argument. Defaults to NULL. |
dnastringsetGenome |
A DNAStringSet object containing the sequence for each contig. |
nContigMinSize |
Minimum size for contigs to keep. Contigs with a size below this value will be removed. gposDeepSignalModBase or gposDeepSignalMod must be provided if using this argument. Defaults to -1 (= no filter). |
listPctSeqByContig |
List containing, for each strand, the percentage of sequencing for each contig. This list must be composed of 2 data frames (one per strand) named f_strand and r_strand. Each data frame must have a "refName" column giving the contig names and a "seqPct" column giving the percentage of sequencing. gposDeepSignalModBase or gposDeepSignalMod must be provided if using this argument. |
nContigMinPctOfSeq |
Minimum percentage of sequencing for contigs to keep. Contigs with a percentage below this value will be removed. gposDeepSignalModBase or gposDeepSignalMod must be provided if using this argument. Defaults to -1 (= no filter). |
listMeanCovByContig |
List containing, for each strand, the mean coverage for each contig. This list must be composed of 2 data frames (one per strand) named f_strand and r_strand. Each data frame must have a "refName" column giving the contig names and a "mean_coverage" column giving the mean coverage. gposDeepSignalModBase or gposDeepSignalMod must be provided if using this argument. |
nContigMinCoverage |
Minimum mean coverage for contigs to keep. Contigs with a mean coverage below this value will be removed. gposDeepSignalModBase or gposDeepSignalMod must be provided if using this argument. Defaults to -1 (= no filter). |
cParamNameForFilter |
A character vector giving the name of the parameter to be filtered. Must correspond to the name of one column in the object provided with gposDeepSignalModBase or gposDeepSignalMod. Defaults to NULL. |
nFiltParamLoBoundaries |
A numeric vector giving the lower boundaries of the intervals. Must have the same length as "nFiltParamUpBoundaries". Defaults to NULL. If this parameter is provided, the function will remove modifications whose values for the given parameter fall outside the intervals defined by "nFiltParamLoBoundaries" and "nFiltParamUpBoundaries". |
nFiltParamUpBoundaries |
A numeric vector giving the upper boundaries of the intervals. Must have the same length as "nFiltParamLoBoundaries". Defaults to NULL. If this parameter is provided, the function will remove modifications whose values for the given parameter fall outside the intervals defined by "nFiltParamLoBoundaries" and "nFiltParamUpBoundaries". |
cFiltParamBoundariesToInclude |
A character vector describing which boundaries are included in the intervals provided. Can be "upperOnly" (only upper boundaries included), "lowerOnly" (only lower boundaries included), "both" (both upper and lower boundaries included) or "none" (neither upper nor lower boundaries included). If NULL, both upper and lower boundaries are included for all intervals (= "both"). Defaults to NULL. |
listMeanParamByContig |
List containing, for each strand, the mean of a given parameter for each contig. This list must be composed of 2 data frames (one per strand) named f_strand and r_strand. Each data frame must have a "refName" column giving the contig names and a "mean_"[parameter name] column giving the mean of the given parameter. If not NULL, contigs in the list provided whose mean is too far from the mean of the parameter over all contigs (that is, outside the interval centered on that mean) are removed. Defaults to NULL. |
nContigFiltParamLoBound |
A numeric value to be subtracted from the mean of the given parameter of all contigs (calculates the lower bound of the interval centered on the mean). Defaults to NULL. |
nContigFiltParamUpBound |
A numeric value to be added to the mean of the given parameter of all contigs (calculates the upper bound of the interval centered on the mean). Defaults to NULL. |
nModMinCoverage |
Minimum coverage for all Modifications to be kept. Modifications with a coverage below this value will be removed. Defaults to NULL (no filter). |
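The interval arguments described above (nFiltParamLoBoundaries, nFiltParamUpBoundaries and cFiltParamBoundariesToInclude) can be illustrated with a small base R sketch on a plain numeric vector. This is not the package's implementation: the helper name InIntervals is hypothetical, and the sketch assumes that a value is kept when it falls in at least one interval and that a single cFiltParamBoundariesToInclude value is recycled over all intervals.

```r
# Hypothetical helper (not part of DNAModAnnot): returns TRUE for each value
# that falls in at least one of the intervals [nLo[i], nUp[i]], using the
# documented boundary-inclusion options.
InIntervals <- function(nValues, nLo, nUp, cInclude = NULL) {
  stopifnot(length(nLo) == length(nUp))
  # NULL is documented as equivalent to "both" for all intervals
  if (is.null(cInclude)) cInclude <- "both"
  # Assumption: a single option is recycled over all intervals
  cInclude <- rep(cInclude, length.out = length(nLo))
  lKeep <- rep(FALSE, length(nValues))
  for (i in seq_along(nLo)) {
    # Lower boundary is inclusive for "both" and "lowerOnly"
    lLo <- if (cInclude[i] %in% c("both", "lowerOnly")) nValues >= nLo[i] else nValues > nLo[i]
    # Upper boundary is inclusive for "both" and "upperOnly"
    lUp <- if (cInclude[i] %in% c("both", "upperOnly")) nValues <= nUp[i] else nValues < nUp[i]
    lKeep <- lKeep | (lLo & lUp)
  }
  lKeep
}

# Toy "frac" values: with "upperOnly", the interval is (0, 1]
nFrac <- c(0, 0.2, 1)
InIntervals(nFrac, nLo = 0, nUp = 1, cInclude = "upperOnly")
# FALSE TRUE TRUE
```

With these settings a site with a fraction of exactly 0 is dropped while a fraction of exactly 1 is kept, which matches the use case of keeping only target sites with a fraction above 0.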
# Loading Nanopore data
myDeepSignalModPath <- system.file(
  package = "DNAModAnnot", "extdata",
  "FAB39088-288418386-Chr1.CpG.call_mods.frequency.tsv"
)
mygposDeepSignalModBase <- ImportDeepSignalModFrequency(
  cDeepSignalModPath = myDeepSignalModPath,
  lSortGPos = TRUE,
  cContigToBeAnalyzed = "all"
)
mygposDeepSignalModBase

# Filtering
mygposDeepSignalMod <- FiltDeepSignal(
  gposDeepSignalModBase = mygposDeepSignalModBase,
  cParamNameForFilter = "frac",
  nFiltParamLoBoundaries = 0,
  nFiltParamUpBoundaries = 1,
  cFiltParamBoundariesToInclude = "upperOnly"
)$Mod
mygposDeepSignalMod
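The contig-level arguments (listMeanParamByContig, nContigFiltParamLoBound, nContigFiltParamUpBound) are not exercised in the example above. The base R sketch below shows the documented list layout and one possible reading of the "interval centered on the mean" rule. The values are made up, the "frac" parameter name is only illustrative, and two points are assumptions rather than documented behaviour: that the mean is computed per strand, and that a contig is flagged when it falls outside the interval on either strand.

```r
# Toy input following the documented layout: one data frame per strand,
# with "refName" and "mean_"[parameter name] columns (values are made up).
listMeanParamByContig <- list(
  f_strand = data.frame(
    refName = c("chr1", "chr2", "chr3"),
    mean_frac = c(0.50, 0.55, 0.95)
  ),
  r_strand = data.frame(
    refName = c("chr1", "chr2", "chr3"),
    mean_frac = c(0.52, 0.50, 0.48)
  )
)
nContigFiltParamLoBound <- 0.2
nContigFiltParamUpBound <- 0.2

# Assumption: the interval is centered on the per-strand mean over all contigs,
# and a contig is flagged if it lies outside that interval on either strand.
cContigsOutside <- unique(unlist(lapply(listMeanParamByContig, function(df) {
  nCenter <- mean(df$mean_frac)
  lOutside <- df$mean_frac < nCenter - nContigFiltParamLoBound |
    df$mean_frac > nCenter + nContigFiltParamUpBound
  as.character(df$refName[lOutside])
}), use.names = FALSE))
cContigsOutside
# "chr3"
```

Contig names obtained this way could, for instance, be compared with what FiltDeepSignal removes, or passed explicitly through cContigToBeRemoved.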