apply_consensus_cutoff: Applies the consensus cutoff to an existing consensuses file


View source: R/consensus_cutoff.R

Description

Reads in a consensuses file and a pid_seq_name file. The size of the largest bin is computed from the data in the pid_seq_name file. Unless a cutoff is supplied directly, the consensus cutoff is computed from the size of the largest bin, the supplied sequencing error rate, and the PID length. From the PIDs in the pid_seq_name file, a list of all PIDs associated with bins larger than the consensus cutoff is made. The sequences in the consensuses file are compared to that list, and the ones that are based on small bins are removed. The remaining consensuses are then written out to a file with the same name as the input file, but with the suffix cc added to it.
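The filtering step described above can be sketched in base R on toy data. This is only an illustration, not the package's implementation: the data frame layout, the column names `seq_name` and `pid`, and the fixed `cutoff` value are all assumptions made for the example, and the model that derives the cutoff from the largest bin and the error rate is not reproduced here.

```r
# Toy stand-in for the pid_seq_name data: one row per read and the PID found in it.
# The column names are assumptions, not the package's actual file format.
pid_seq_name <- data.frame(
  seq_name = paste0("read_", 1:9),
  pid = c("AAAA", "AAAA", "AAAA", "AAAA", "AAAA",
          "CCCC", "CCCC", "CCCC", "GGGG"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the consensus sequences, named by the PID of their bin.
consensuses <- c(AAAA = "ACGTACGT", CCCC = "ACGTACGA", GGGG = "ACGTTCGT")

# Bin size = number of reads that share a PID.
bin_sizes <- vapply(split(pid_seq_name$seq_name, pid_seq_name$pid),
                    length, integer(1))
largest_bin <- max(bin_sizes)

# Assumed fixed cutoff, standing in for the model-derived value.
cutoff <- 2

# PIDs whose bins are larger than the cutoff.
keep_pids <- names(bin_sizes)[bin_sizes > cutoff]

# Keep only the consensuses built from sufficiently large bins.
kept <- consensuses[names(consensuses) %in% keep_pids]
print(kept)
```

With these toy values the bins for AAAA (5 reads) and CCCC (3 reads) exceed the cutoff, so the consensus for GGGG (1 read) is dropped.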

Usage

apply_consensus_cutoff(pid_seq_name_file, consensuses_file,
  sequencing_error_rate = 1/50, motif_length = 8,
  consensus_cutoff = "model")

Arguments

pid_seq_name_file

The name of the file that contains the list of sequence_names and the PIDs found in them.

consensuses_file

The name of the file with the consensus sequences.

sequencing_error_rate

The sequencing error rate to use when looking up the consensus cutoff model (only required if 'consensus_cutoff' is left at its default, 'model').

motif_length

The length of the PID.

consensus_cutoff

The consensus cutoff to use. If a number is specified, that number will be used to choose which sequences to keep. Alternatively, if it is set to 'model' (the default), a consensus cutoff will be computed based on the size of the largest bin and the sequencing_error_rate.
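As a rough illustration of the two modes, the sketch below builds the argument lists for a model-based call and for a call with an explicit numeric cutoff. The file names are hypothetical placeholders, the cutoff of 5 is an arbitrary example value, and the call assumes that the package is installed under the name MotifBinner2 and exports this function. The calls are guarded so that nothing runs unless the package and both files are present.

```r
# Hypothetical input paths; replace with real files.
pid_file  <- "sample_pid_seq_name.csv"
cons_file <- "sample_consensuses.fastq"

# Mode 1: the default, where the cutoff is looked up from the model
# using the sequencing error rate and the PID length.
args_model <- list(
  pid_seq_name_file = pid_file,
  consensuses_file = cons_file,
  sequencing_error_rate = 1/50,
  motif_length = 8,
  consensus_cutoff = "model"
)

# Mode 2: an explicit numeric cutoff (5 is an arbitrary example value).
args_fixed <- list(
  pid_seq_name_file = pid_file,
  consensuses_file = cons_file,
  consensus_cutoff = 5
)

# Only attempt the calls if the package and the input files are available.
can_run <- requireNamespace("MotifBinner2", quietly = TRUE) &&
  file.exists(pid_file) && file.exists(cons_file)

if (can_run) {
  do.call(MotifBinner2::apply_consensus_cutoff, args_model)
  do.call(MotifBinner2::apply_consensus_cutoff, args_fixed)
}
```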


HIVDiversity/MotifBinner2 documentation built on May 6, 2019, 6:44 p.m.