amplicanNormalize: Remove events that can be found in Controls.

View source: R/amplicanNormalize.R

amplicanNormalize    R Documentation

Remove events that can be found in Controls.

Description

This function adjusts events for small differences between the known annotations (amplicon sequences) and the actual DNA of the strain that was sequenced. Events from the controls are grouped by the columns given in add, and their frequencies are calculated relative to the total number of reads in each group. Next, the control events are filtered according to min_freq: all events below this frequency are treated as sequencing errors and rejected. Finally, every event in the treatment group that has an exact match (on the non-skipped columns) in the control group is removed. All events from the control group are returned unchanged.
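The steps described above can be sketched in base R on a toy table. This is only an illustration of the described logic, not amplican's implementation: the events table, the control_reads table, and the key() helper are hypothetical, and the sketch assumes that events with a frequency exactly equal to min_freq are kept.

```r
# Illustrative sketch only; NOT the code used inside amplicanNormalize().
# Hypothetical toy events, already annotated with the columns from `add`.
events <- data.frame(
  guideRNA = "ACTG",
  Group    = "A",
  Control  = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  start    = c(10, 50, 10, 50, 70),
  end      = c(12, 51, 12, 51, 75),
  counts   = c(40, 1, 30, 5, 20),
  stringsAsFactors = FALSE
)
# Hypothetical total number of control reads per (guideRNA, Group).
control_reads <- data.frame(guideRNA = "ACTG", Group = "A", reads = 200)

add      <- c("guideRNA", "Group")
min_freq <- 0.01
match_by <- c(add, "start", "end")  # the non-skipped columns in this toy case

ctrl  <- events[events$Control, ]
treat <- events[!events$Control, ]

# 1. Frequency of each control event relative to the reads in its group.
ctrl <- merge(ctrl, control_reads, by = add)
ctrl$freq <- ctrl$counts / ctrl$reads

# 2. Control events below min_freq are treated as noise and not used for matching.
ctrl_used <- ctrl[ctrl$freq >= min_freq, ]

# 3. Drop treatment events that exactly match a retained control event.
key <- function(df) do.call(paste, c(df[match_by], sep = "\r"))
treat_kept <- treat[!key(treat) %in% key(ctrl_used), ]

# 4. All control events are returned, together with the surviving treatment events.
normalized <- rbind(events[events$Control, ], treat_kept)
normalized
```

In this toy case the treatment event at 10-12 is removed because it matches a control event with frequency 0.2. The treatment event at 50-51 is kept because its control counterpart has frequency 0.005, which is below min_freq, and the event at 70-75 is kept because it has no control counterpart.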

Usage

amplicanNormalize(
  aln,
  cfgT,
  add = c("guideRNA", "Group"),
  skip = c("counts", "score", "seqnames", "read_id", "strand", "overlaps", "consensus"),
  min_freq = 0.01
)

Arguments

aln

(data.frame) Contains events from alignments.

cfgT

(data.frame) Config table with information about experiments.

add

(character vector) Columns from cfgT that should be included in the event table for normalization matching. Defaults to c("guideRNA", "Group"), which means that only events created by the same guideRNA in the same Group will be removed if found in Control.

skip

(character vector) Specifies which columns of aln to skip when matching treatment events against control events.
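As a rough illustration of what skipping means, the snippet below derives the columns that would remain for matching from a hypothetical set of event table columns. It assumes the matching columns are simply those not listed in skip; the aln_cols vector is made up for this sketch and is not taken from the package.

```r
# Hypothetical column names of an event table after the `add` columns are attached.
aln_cols <- c("seqnames", "start", "end", "width", "counts", "guideRNA", "Group")

# Default value of `skip` from the Usage section.
skip <- c("counts", "score", "seqnames", "read_id", "strand", "overlaps", "consensus")

# Columns left for exact matching (assumption: everything not in `skip`).
match_cols <- setdiff(aln_cols, skip)
match_cols
# [1] "start"    "end"      "width"    "guideRNA" "Group"
```

Names in skip that are absent from the table, such as "score" here, have no effect on the result.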

min_freq

(numeric) Events from the control group with a frequency below this value will not be included in filtering. Use this to filter out background noise and sequencing errors.

Value

(data.frame) Same as aln, but with events normalized. Events from Control are not changed. Additionally, the columns named in add are added to the data.frame.

See Also

Other analysis steps: amplicanAlign(), amplicanConsensus(), amplicanFilter(), amplicanMap(), amplicanOverlap(), amplicanPipelineConservative(), amplicanPipeline(), amplicanReport(), amplicanSummarize()

Examples

library(amplican)

# toy events table: one event per experiment, identical start/end/width
aln <- data.frame(seqnames = 1:5, start = 1, end = 2, width = 2,
                  counts = 101:105)
# config table: experiments 1 and 3 are the controls of groups A and B
cfgT <- data.frame(ID = 1:5, guideRNA = rep("ACTG", 5),
                   Reads_Filtered = c(2, 2, 3, 3, 4),
                   Group = c("A", "A", "B", "B", "B"),
                   Control = c(TRUE, FALSE, TRUE, FALSE, FALSE))
# all treatment events are the same as in the control group, so they are
# filtered out; events from the control groups stay
amplicanNormalize(aln, cfgT)
# events that differ from the control group are preserved
aln[2, "start"] <- 3
amplicanNormalize(aln, cfgT)


valenlab/amplican documentation built on Jan. 28, 2024, 5:10 a.m.