PAC_mapper: Advanced sequence mapping of a PAC object

View source: R/PAC_mapper.R

PAC_mapper R Documentation

Advanced sequence mapping of a PAC object

Description

PAC_mapper maps sequences against a small reference.

Usage

PAC_mapper(
  PAC,
  ref,
  mismatches = 0,
  multi = "remove",
  threads = 1,
  N_up = "",
  N_down = "",
  report_string = FALSE,
  override = FALSE
)

Arguments

PAC

PAC-list object.

ref

Character indicating the path to the fasta (.fa) reference file, or a DNAStringSet with already loaded reference sequences. If a Bowtie index is missing for the reference, PAC_mapper will attempt to temporarily generate such an index automatically. Large references are therefore discouraged; for those, we suggest using the original reanno workflow instead.

mismatches

Integer indicating the number of mismatches that should be allowed in the mapping.

multi

Character indicating how to deal with multimapping. If multi="keep", query sequences that map multiple times to the same reference sequence will be reported >1 times in the output (indicated by .1, .2, .3 etc. in the reported sequence name). If multi="remove" (default), then all multimapping sequences will be removed, resulting in 1 row for each query sequence that maps to the target reference sequence. The function will always give a warning if a query sequence maps to multiple sites within a reference sequence. Note, however, that this function discriminates multimapping only within a reference sequence. Thus, if the fasta input contains multiple reference sequences, a query sequence may be reported in multiple reference sequences.

threads

Integer indicating the number of parallel processes that should be used.

N_up

Character indicating a sequence that should be added to the reference at the 5' end prior to mapping. Wildcard nucleotides such as "NNN" (any of C, T, G, A) can, for example, be added to allow mapping of sequences that do not perfectly match the reference. No nucleotides are added by default.

N_down

Character. Same as N_up, but indicating a sequence that should be added to the reference at the 3' end. Useful for tRNA analysis where the reference does not contain pre-processed tRNA. Setting N_down="NNN" or "CCA" (in many species CCA is added to mature tRNA) will allow mapping against the mature tRNA. No nucleotides are added by default.
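
As a rough illustration of what N_up and N_down mean for the reference, the sketch below pads a sequence with base R only. The helper pad_reference() and the example sequence are hypothetical and not part of seqpac; how PAC_mapper applies the padding internally is not shown here and may differ.

```r
# Hypothetical helper (not a seqpac function): adds N_up to the 5' end
# and N_down to the 3' end of each reference sequence.
pad_reference <- function(ref_seq, N_up = "", N_down = "") {
  padded <- paste0(N_up, ref_seq, N_down)
  names(padded) <- names(ref_seq)  # paste0() drops names, so restore them
  padded
}

# Made-up reference sequence, used only for illustration
refs <- c(tRNA_example = "GCATTGGTGGTTCAGTGGTAGAATTCTCGCC")

# Mimic N_down = "CCA": the 3' end now carries the CCA tail of a mature tRNA
padded <- pad_reference(refs, N_down = "CCA")

# The padded reference is three nucleotides longer than the original
nchar(padded) - nchar(refs)
```

In an actual call, the same strings would be passed directly, for example PAC_mapper(PAC, ref, N_down = "CCA").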

report_string

Logical indicating whether to report an alignment string that shows, in character format, where sequences align against the reference. Works well with tRNA, but makes the Alignments object difficult to work with when longer references are used (default=FALSE).

override

Logical indicating whether or not the map_reanno function should prompt you with a question if there are files in the temporary path. By default, override=FALSE prevents deleting large files by accident, but requires an interactive R session. Setting override=TRUE may solve problems in non-interactive sessions.

Details

Given a PAC object and the path to a fasta reference file, this function will map sequences in the PAC using a 'backdoor' into the reanno workflow.

Value

Stacked list, where each object on the highest level contains: (Object 1) Reference name and sequence. (Object 2) Data.frame showing the mapping results of each query sequence that mapped to Object 1.
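
To make the stacked structure concrete, the sketch below builds a small mock list with the same two-level shape and counts the hits per reference. The mock data and all element and column names (ref, hits, start, end) are hypothetical placeholders, not the names used by PAC_mapper; the sub-objects are therefore accessed by position, following the order given above.

```r
# Hypothetical stand-in for a PAC_mapper() result. Names and values are
# made up for illustration; a real result will differ.
mock_map <- list(
  ref_A = list(
    ref  = c(ref_A = "ACGTACGTAC"),              # Object 1: name and sequence
    hits = data.frame(start = c(1L, 5L),         # Object 2: mapping results
                      end   = c(4L, 9L),
                      row.names = c("ACGT", "ACGTA"))
  ),
  ref_B = list(
    ref  = c(ref_B = "TTTTGGGG"),
    hits = data.frame(start = integer(0), end = integer(0))
  )
)

# One top-level entry per reference sequence
names(mock_map)

# Number of mapped query sequences per reference (rows of Object 2)
n_hits <- vapply(mock_map, function(x) nrow(x[[2]]), integer(1))

# Keep only the references that received at least one hit
mapped_only <- mock_map[n_hits > 0]
```

The same pattern (looping over the top level and indexing the second object) should carry over to a real result, once the actual names have been checked with str().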

See Also

https://github.com/Danis102 for updates on the current package.

Other PAC analysis: PAC_covplot(), PAC_deseq(), PAC_filter(), PAC_filtsep(), PAC_gtf(), PAC_jitter(), PAC_nbias(), PAC_norm(), PAC_pca(), PAC_pie(), PAC_saturation(), PAC_sizedist(), PAC_stackbar(), PAC_summary(), PAC_trna(), as.PAC(), filtsep_bin(), map_rangetype(), tRNA_class()

Examples


###########################################################
### Simple example of how to use PAC_mapper 
# Note: For more details, see the vignette and manuals.
# Also see: ?map_rangetype, ?tRNA_class or ?PAC_trna, ?PAC_covplot 
# for more examples on how to use PAC_mapper.

## Load PAC-object, make summaries and extract rRNA
load(system.file("extdata", "drosophila_sRNA_pac_filt_anno.Rdata", 
                 package = "seqpac", mustWork = TRUE))

pac <- PAC_summary(pac, norm = "cpm", type = "means", 
                   pheno_target=list("stage", unique(pheno(pac)$stage)))
                   
pac_rRNA <- PAC_filter(pac, anno_target = list("Biotypes_mis0", "rRNA"))

## Give the path to a fasta reference (with or without bowtie index)
#  (Here we use an rRNA fasta included in seqpac) 

ref_rRNA <- system.file("extdata/rrna", "rRNA.fa", 
                         package = "seqpac", mustWork = TRUE)
                         

## Map using PAC_mapper
map_rRNA <- PAC_mapper(pac_rRNA, mismatches=0, 
                        threads=1, ref=ref_rRNA, override=TRUE)


Danis102/seqpac documentation built on Aug. 26, 2023, 10:15 a.m.