animalcules_id: Animalcules ID


Description

This function reads in a .bam file, annotates the taxonomy and genome names, reduces mapping ambiguity using a mixture model, and writes the results to a .csv file. It currently assumes that the genome library and the .bam file use NCBI accession names as reference names (rnames in the .bam file).
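Because the function relies on accession-style reference names, it can be useful to check them before running it. The following is a minimal sketch, not part of the package: `looks_like_accession` is a hypothetical helper, and the pattern only covers RefSeq-style accessions such as "NC_001802.1". Other NCBI accession formats exist, and how the package itself parses rnames is not shown here.

```r
# Hypothetical helper (not part of animalcules.preprocess):
# TRUE when a reference name looks like a RefSeq-style accession,
# i.e. two capital letters, an underscore, digits, and an optional version.
looks_like_accession <- function(rnames) {
  grepl("^[A-Z]{2}_[0-9]+(\\.[0-9]+)?$", rnames)
}

# Illustrative reference names; the last one is not an accession.
rnames <- c("NC_001802.1", "NC_045512.2", "chr1")
looks_like_accession(rnames)
```

Names that fail a check like this are a hint that the reference library was not built from NCBI accessions, in which case the annotation step may not find matching taxonomy records.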

Usage

animalcules_id(bam_file,
  out_file = paste(tools::file_path_sans_ext(bam_file),
  ".animalculesID.csv", sep = ""), EMconv = 0.001, EMmaxIts = 50)
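The default `out_file` expression in the signature can be evaluated on its own to see which file name a given input would produce. The file names below are made up for illustration; only base R and the `tools` package are used.

```r
# Same expression as the out_file default, applied to an illustrative file name.
bam_file <- "sample1.bam"
out_file <- paste(tools::file_path_sans_ext(bam_file),
                  ".animalculesID.csv", sep = "")
out_file
# "sample1.animalculesID.csv"

# file_path_sans_ext() drops only the extension, so any directory is kept.
paste(tools::file_path_sans_ext("results/sample1.bam"),
      ".animalculesID.csv", sep = "")
# "results/sample1.animalculesID.csv"
```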

Arguments

bam_file

The .bam file to be summarized, annotated, and disambiguated.

out_file

The name of the .csv output file. Defaults to the bam_file path without its extension, plus ".animalculesID.csv".

EMconv

The convergence parameter of the EM algorithm. Defaults to 0.001.

EMmaxIts

The maximum number of EM iterations; the algorithm stops after this many iterations even if the EMconv threshold has not been reached. Defaults to 50. If set to 0, the algorithm skips the EM step and summarizes the .bam file 'as is'.
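To illustrate how a convergence threshold and an iteration cap interact in an EM-style reassignment, here is a minimal sketch. It is not the package's implementation: `em_reassign`, its arguments, and the toy `hits` matrix are hypothetical, and the model is a plain mixture over genome proportions. It assumes every read maps to at least one genome, and it treats `max_its = 0` as "split each read equally among its hits", which is only one possible reading of summarizing the file 'as is'.

```r
# Hypothetical sketch, not the animalcules.preprocess implementation.
# hits: numeric 0/1 matrix, rows = reads, columns = genomes;
#       hits[i, j] is 1 when read i maps to genome j.
#       Assumes every read has at least one hit.
em_reassign <- function(hits, conv = 0.001, max_its = 50) {
  # Starting point: each read is shared equally among the genomes it hits.
  resp <- hits / rowSums(hits)
  pi_hat <- colMeans(resp)
  its <- 0
  while (its < max_its) {
    # E-step: reweight each read's hits by the current genome proportions.
    weighted <- sweep(hits, 2, pi_hat, `*`)
    resp <- weighted / rowSums(weighted)
    # M-step: proportions are the average responsibilities.
    pi_new <- colMeans(resp)
    its <- its + 1
    delta <- max(abs(pi_new - pi_hat))
    pi_hat <- pi_new
    # Stop early once the largest change falls below the threshold.
    if (delta < conv) break
  }
  list(proportions = pi_hat, responsibilities = resp, iterations = its)
}

# Toy data: three reads unique to genomeA, one read shared with genomeB.
hits <- rbind(c(1, 0), c(1, 0), c(1, 0), c(1, 1))
colnames(hits) <- c("genomeA", "genomeB")

res <- em_reassign(hits)                 # EM pulls the shared read toward genomeA
res0 <- em_reassign(hits, max_its = 0)   # no EM: the shared read stays split 50/50
```

In this toy case the shared read is the only evidence for genomeB, so the iterations shift its weight toward genomeA until the change between iterations drops below `conv` or `max_its` is reached, whichever comes first.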

Value

This function writes a .csv file with annotated read counts for the genomes that have mapped reads. The function itself returns the name of the output .csv file.

Examples

## Get a reference genome library
download_refseq('viral', compress = FALSE)

## Make and align to a single reference genome library
mk_subread_index('viral.fasta')
readPath <- system.file("extdata", "virus_example.fastq", package = "animalcules.preprocess")
viral_map <- align_target(readPath, "viral", "virus_example")

## Apply animalcules ID
animalcules_id(viral_map)

wevanjohnson/animalcules.preprocess documentation built on May 11, 2019, 8:26 p.m.