extract_region: Extract regions from a set of sequences (maybe with...
In brendanf/tzara: Cluster long amplicons using dada2 denoising on variable regions

Extract regions from a set of sequences (maybe with qualities)

extract_region(
  seq,
  positions,
  region,
  region2 = region,
  outfile = NULL,
  append = FALSE,
  format,
  compress = NULL,
  ...
)

## S3 method for class 'character'
extract_region(
  seq,
  positions,
  region,
  region2 = region,
  outfile = NULL,
  append = FALSE,
  format = NULL,
  compress = NULL,
  qualityType = "FastqQuality",
  ...
)

## S3 method for class 'XStringSet'
extract_region(
  seq,
  positions,
  region,
  region2 = region,
  outfile = NULL,
  append = FALSE,
  format = NULL,
  compress = NULL,
  ...
)

## S3 method for class 'ShortRead'
extract_region(
  seq,
  positions,
  region,
  region2 = region,
  outfile = NULL,
  append = FALSE,
  format = NULL,
  compress = NULL,
  ...
)

## S3 method for class 'list'
extract_region(seq, positions, region, region2 = region, outfile = NULL, ...)

`seq`	(`character` scalar (a file name), an object of one of several classes representing nucleotide sequences, a `character` vector of nucleotide sequences, or a `list` of any of these) The sequences to extract regions from. To extract from several files, pass a `list` of single file names rather than a vector of file names.
`positions`	(`data.frame`) As returned by `itsx` with `positions = TRUE` and `read_function` set. It should have columns `$seq_id` with sequence IDs (matching those in `seq`), `$region` giving the name of each region, and `$start` and `$end` giving the start and end locations of each region, if found.
`region`	(`character`) The region to extract. Should match a value given in `positions$region`.
`region2`	(`character`) If different from `region`, then the entire segment beginning at the start of `region` and ending at the end of `region2` will be extracted. For instance, to extract the entire ITS region, use `region = 'ITS1', region2 = 'ITS2'`.
`outfile`	(`character`) If given, the output will be written to the filename given in fasta or fastq format. The format is determined by `seq`, not by the extension of `outfile`.
`append`	(`logical` scalar) If `TRUE`, data are appended to `outfile`; if `FALSE`, existing data in `outfile` are overwritten.
`format`	(`character`) File format to write (if `outfile` is given). Default is to guess based on ".fasta[.gz]" or ".fastq[.gz]" extension.
`compress`	(`logical` scalar or NULL) Whether to gz-compress `outfile`. Default is to detect based on presence of ".gz" extension.
`...`	Passed to methods.
`qualityType`	(`character` scalar) fastq file quality encoding; see `readFastq`.
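
The defaults described for `format` and `compress` amount to reading both settings off the extension of `outfile`. The following base R sketch illustrates that rule only; `guess_output` is a hypothetical helper written for this example, not a function exported by tzara, and the package's own detection may differ in its details.

```r
# Hypothetical helper: guess the format and compression from a file name,
# following the ".fasta[.gz]" / ".fastq[.gz]" convention described above.
guess_output <- function(outfile) {
  # A trailing ".gz" means the file should be gz-compressed.
  compress <- grepl("\\.gz$", outfile)
  # Strip ".gz" (if present) before looking at the format extension.
  base <- sub("\\.gz$", "", outfile)
  format <- if (grepl("\\.fasta$", base)) {
    "fasta"
  } else if (grepl("\\.fastq$", base)) {
    "fastq"
  } else {
    NA_character_  # unrecognised extension: no guess is made
  }
  list(format = format, compress = compress)
}

guess_output("its1.fastq.gz")
guess_output("its1.fasta")
```

With an unrecognised extension the sketch returns `NA` for the format, which is the situation where passing `format` explicitly would be the safer choice.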

(An object of the same class as `seq`; or, if `seq` is a file name, an `XStringSet` or `QualityScaledXStringSet`, depending on the format of the file) The requested region from each of the input sequences in which it was found.
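
To make the interaction between `positions`, `region` and `region2` concrete, here is a minimal base R sketch of the selection logic on plain character strings. It is an illustration under stated assumptions, not the package's implementation: `extract_span`, the toy sequences and the toy `positions` table (including the region name `5_8S`) are all made up for this example, and the real function also handles files, `XStringSet` and `ShortRead` inputs, and quality scores.

```r
# Toy positions table in the layout described for `positions`.
# In seq2 the ITS2 region was not found, so its start and end are NA.
positions <- data.frame(
  seq_id = c("seq1", "seq1", "seq1", "seq2", "seq2", "seq2"),
  region = c("ITS1", "5_8S", "ITS2", "ITS1", "5_8S", "ITS2"),
  start  = c(5L, 11L, 15L, 3L, 9L, NA),
  end    = c(10L, 14L, 20L, 8L, 12L, NA),
  stringsAsFactors = FALSE
)

# Toy input sequences, named by sequence ID.
seqs <- c(seq1 = "AAAACCCCCCGGGGTTTTTTAA", seq2 = "AACCCCCCGGGGTT")

# Hypothetical helper: take the start of `region` and the end of `region2`
# for each sequence, and drop sequences where either boundary is missing.
extract_span <- function(seqs, positions, region, region2 = region) {
  first <- positions[positions$region == region, c("seq_id", "start")]
  last  <- positions[positions$region == region2, c("seq_id", "end")]
  span  <- merge(first, last, by = "seq_id")
  span  <- span[!is.na(span$start) & !is.na(span$end), ]
  span  <- span[span$seq_id %in% names(seqs), ]
  out <- substr(seqs[span$seq_id], span$start, span$end)
  names(out) <- span$seq_id
  out
}

its1 <- extract_span(seqs, positions, "ITS1")          # one region
full <- extract_span(seqs, positions, "ITS1", "ITS2")  # ITS1 through ITS2
```

In this sketch `its1` contains an entry for both sequences, while `full` contains only `seq1`, because the end of `ITS2` was not located in `seq2`.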

brendanf/tzara documentation built on March 11, 2021, 5:40 a.m.