process_file: Processes a file into consensus bins
In philliplab/MotifBinner: Bin reads by Motif

Processes a file into consensus bins

process_file(file_name, output, prefix = "CCAGCTGGTTATGCGATTCTMARGTG",
  suffix = "CTGAGCGTGTGGCAAGGCCC", motif_length = 9,
  max.mismatch_start = 0, max.mismatch = 5, threshold = 8/600,
  start_threshold = 8/600, max_sequences = 400, remove_gaps = TRUE,
  strip_uids = TRUE, n_bins_to_process = 0, verbose = TRUE,
  prefix_for_names = "")
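
As a rough illustration of the usage above, a call might be assembled as in the following sketch. The input path and output location are hypothetical placeholders, and the call is guarded so that it only runs when the MotifBinner package is installed and the input file exists. The usage names the output argument `output`, while the argument list documents `output_dir`; the sketch follows the usage.

```r
# Minimal sketch, not taken from the package documentation.
# "input_reads.fastq" and "consensus_out" are hypothetical placeholders.
args <- list(
  file_name = "input_reads.fastq",  # hypothetical input file
  output = "consensus_out",         # hypothetical output location
  motif_length = 9,                 # default from the usage
  max.mismatch = 5,                 # default from the usage
  threshold = 8/600,                # default from the usage
  n_bins_to_process = 10,           # process only 10 bins for a quick trial
  verbose = TRUE
)

# Only attempt the call when the package and the input file are available.
if (requireNamespace("MotifBinner", quietly = TRUE) &&
    file.exists(args$file_name)) {
  do.call(MotifBinner::process_file, args)
}
```

Setting `n_bins_to_process` to a small positive number is a convenient way to try out the other settings before processing every bin.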

`file_name`	The name of the input file to process
`prefix`	The prefix that is used to identify the motif
`suffix`	The suffix that is used to identify the motif
`motif_length`	The length of the motif that forms the pid.
`max.mismatch`	The maximum number of mismatches to allow when searching for the pid
`threshold`	Outlier sequences are removed from the bin until the maximum distance between any two sequences drops below this threshold.
`start_threshold`	Only start the classification if the maximum distance between any two sequences in the bin is greater than this.
`max_sequences`	The maximum number of sequences to use for the computation of the distance matrix. If more sequences than this are present, this many sequences are selected at random and the classification algorithm is run on them. This is only to improve the computation speed.
`remove_gaps`	If set to TRUE (the default), gaps will be removed from the consensus sequences.
`strip_uids`	Remove the unique identifiers from the sequence names. The stripping is naive: the names are split on '_' and only the first and last pieces are kept.
`n_bins_to_process`	The number of bins to process through the outlier detection, alignment and consensus generation. If smaller than or equal to 0, all bins will be processed.
`verbose`	Progress information will be provided if set to TRUE
`prefix_for_names`	Text to add to the front of the name of each resulting consensus sequence.
`output_dir`	The directory where the output must be stored
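
The `threshold`, `start_threshold` and `strip_uids` descriptions above each describe a small procedure, which can be sketched in base R. This is an illustration under stated assumptions, not MotifBinner's implementation: `trim_bin_sketch` and `strip_uid_sketch` are hypothetical helpers, the choice of which sequence counts as the outlier (largest mean distance to the rest) is an assumption, and so is rejoining the kept name pieces with '_'.

```r
# Hypothetical helper: drop outliers from a bin until the maximum pairwise
# distance falls below `threshold`. `d` is a symmetric distance matrix.
# ASSUMPTION: the outlier is the sequence with the largest mean distance.
trim_bin_sketch <- function(d, threshold, start_threshold = threshold) {
  keep <- seq_len(nrow(d))
  # Only start if the maximum distance exceeds start_threshold.
  if (length(keep) < 2 || max(d) <= start_threshold) return(keep)
  # Stop once the maximum distance is below threshold (or 2 sequences remain).
  while (length(keep) > 2 && max(d[keep, keep]) >= threshold) {
    mean_dist <- rowMeans(d[keep, keep, drop = FALSE])
    keep <- keep[-which.max(mean_dist)]
  }
  keep
}

# Hypothetical helper: split names on '_' and keep the first and last pieces.
# ASSUMPTION: the two kept pieces are rejoined with '_'.
strip_uid_sketch <- function(seq_names) {
  vapply(strsplit(seq_names, "_", fixed = TRUE), function(p) {
    if (length(p) < 2) return(paste(p, collapse = ""))
    paste(p[1], p[length(p)], sep = "_")
  }, character(1), USE.NAMES = FALSE)
}

# Toy bin: sequences 1 to 3 are close to each other, sequence 4 is far away.
d <- matrix(0.001, 4, 4)
d[4, ] <- 0.1
d[, 4] <- 0.1
diag(d) <- 0
kept <- trim_bin_sketch(d, threshold = 8/600)

stripped <- strip_uid_sketch(c("sampleA_ACGTACGTA_12", "sampleB_1"))
```

In the toy bin, the distant fourth sequence is the one removed. The name example also shows why the splitting is described as naive: any '_' inside the first or last piece of a name would change which pieces are kept.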