n_filter: Remove sequences with non-identified bases (Ns) from a...


View source: R/n_filter.R

Description

This program is a wrapper to nFilter. It removes sequences whose number of N's exceeds the threshold value 'rm.N'. All sequences with a number of N's > rm.N (that is, N's >= rm.N + 1) are removed.

Usage

n_filter(input, rm.N)

Arguments

input

ShortReadQ object

rm.N

Threshold number of N's: sequences with more N's than this value are removed from the output. For example, if rm.N is 3, all sequences with more than 3 N's (that is, 4 or more) are removed.
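
The threshold rule can be illustrated with a minimal base-R sketch on plain character strings. This is not the package's implementation: count_n is a hypothetical helper introduced only for this illustration, and the example sequences are made up.

```r
# Hypothetical helper (not part of FastqCleaner): count 'N' characters
# in each element of a character vector.
count_n <- function(seqs) {
  vapply(strsplit(seqs, "", fixed = TRUE),
         function(chars) sum(chars == "N"),
         integer(1))
}

# Illustrative sequences with 0, 2, 3, 4 and 5 N's
seqs <- c("ACGT", "ANNT", "NNNA", "NNNN", "ANNNNN")
n_counts <- count_n(seqs)

# With rm.N = 3, a sequence is kept when its number of N's is <= 3
rm.N <- 3
kept <- seqs[n_counts <= rm.N]
print(kept)
```

A sequence with exactly rm.N N's is kept; only sequences with rm.N + 1 or more are dropped.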

Value

Filtered ShortReadQ object

Author(s)

Leandro Roser learoser@gmail.com

Examples

require('Biostrings')
require('ShortRead')

# create 50 sequences of width 20
set.seed(10)
input <- random_seq(50, 20)

# inject N's
set.seed(10)
input <- inject_letter_random(input, how_many_seqs = 1:30, 
how_many = 1:10)

input <- DNAStringSet(input)


# watch the N's frequency
hist(letterFrequency(input, 'N'), breaks = 0:10, 
main  = 'Ns Frequency', xlab = '# Ns')

# create qualities of width 20
set.seed(10) 
input_q <- random_qual(50, 20)

# create names
input_names <- seq_names(50)

# create ShortReadQ object
my_read <- ShortReadQ(sread = input, quality = input_q, id = input_names)

# apply the filter 
filtered <- n_filter(my_read, rm.N = 3)

# inspect the filtered sequences
sread(filtered)

# watch the N's frequency
hist(letterFrequency(sread(filtered), 'N'), 
main = 'Ns distribution', xlab = '')

FastqCleaner documentation built on Nov. 8, 2020, 5:05 p.m.