Import and process single-end or paired-end alignments in a BAM/SAM/BED file, retaining only the valid alignments as defined by the arguments below. Multihits (the same read mapped to multiple loci) are flagged for subsequent disambiguation with the function disambiguateMultihits. The final output is a GAlignments object.
alignFilePath |
Path to the alignment file. |
format |
The alignment format can be determined automatically from the file extension or specified by the user. The supported formats are BAM, SAM, and BED. |
genomeBuild |
Genome build used to obtain the chromosome information from the online UCSC database in order to construct the GAlignments object. Since the BAM/SAM header provides the chromosome information, this argument needs to be set only when the header information is absent from a BAM/SAM file or when a BED file is used. Examples for the common |
deleteGeneratedBAM |
Binary indicator of whether the BAM file converted from the original SAM input file should be deleted from the local disk (Default: FALSE). |
reverseComplement |
Binary indicator of whether the reads were sequenced from the opposite strand of the original RNA molecule. |
returnDuplicate |
Indicator (TRUE, FALSE, NA) of whether duplicate alignments should be returned (Default: FALSE). Duplicate reads are a set of reads that align to exactly the same genomic coordinates. Because transcripts are usually hundreds or thousands of base pairs long and thus much longer than the reads (25-100 nt), the chance that the same 25-100 nt portion of a transcript is sequenced multiple times is very small, so such duplicates are very likely due to PCR artifacts. This argument is actually passed to 'isDuplicate' in |
flagMultiHits |
Binary indicator of whether to add a binary column named "uniqueHits" that indicates whether each aligned read is a unique hit (uniqueHits==TRUE) or a multihit (uniqueHits==FALSE). Multihits represent multiple alignments of the same read due to gene duplications or repetitive elements of the genome. Multihits typically constitute a substantial proportion of the total mapped reads. Rather than being removed, these multihits are flagged ( |
returnOnlyUniqueHits |
Binary indicator to return only the unique hits and discard all of the multihits (Default: FALSE). |
paired |
Binary indicator of whether the alignments are paired-end (Default: FALSE). For paired-end alignments, properly paired reads are combined into a single alignment record, using the CIGAR operation 'N' to indicate the number of bases between the mate pairs (i.e., the length of the insert fragment). In other words, the paired-end alignments are treated as gapped alignments of long fragments (See |
... |
Extra arguments are ignored. |
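The returnDuplicate, flagMultiHits, and paired descriptions above can be illustrated on a toy table in base R. This is only a sketch of the concepts, not the way RIPSeeker implements them: the data frame, its column names (other than uniqueHits), and the helper pairedCigar are made up for illustration.

```r
# Toy alignment table; every value here is invented for illustration.
aln <- data.frame(
  read   = c("r1", "r2", "r3", "r3", "r4"),
  chr    = c("chr1", "chr1", "chr2", "chrX", "chr1"),
  start  = c(100L, 100L, 500L, 900L, 300L),
  end    = c(135L, 135L, 535L, 935L, 335L),
  strand = c("+", "+", "-", "+", "-"),
  stringsAsFactors = FALSE
)

# Duplicates: an alignment whose chromosome, start, end, and strand
# all match an earlier row (here r2 repeats the coordinates of r1).
coordKey <- paste(aln$chr, aln$start, aln$end, aln$strand, sep = ":")
aln$dupCoord <- duplicated(coordKey)

# Multihits: read names that occur in more than one alignment record
# (here r3 maps to both chr2 and chrX).
multiNames <- unique(aln$read[duplicated(aln$read)])
aln$uniqueHits <- !(aln$read %in% multiNames)

# Hypothetical helper: combine two mates into one gapped CIGAR string,
# where the N run counts the bases between the end of the first mate
# and the start of the second.
pairedCigar <- function(start1, len1, start2, len2) {
  end1 <- start1 + len1 - 1L
  gap  <- start2 - end1 - 1L
  sprintf("%dM%dN%dM", len1, gap, len2)
}

print(aln)
print(pairedCigar(100L, 36L, 264L, 36L))
```

In this toy table only r2 is marked as a coordinate duplicate, and both r3 records get uniqueHits == FALSE, which mirrors the idea of flagging multihits rather than discarding them.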
The BAM file is imported using readGAlignments for single-end or readGAlignmentPairs for paired-end alignments. The SAM file is converted to BAM first and then imported as above. The BED file is first imported by import as a GRanges object and subsequently converted to GAlignments via the constructor function GAlignments.
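The import flow described above, together with the automatic format detection mentioned for the format argument, might be sketched as a small dispatch in base R. The helpers resolveFormat and importPlan are hypothetical and not part of RIPSeeker; they only return the names of the steps that this page mentions and do not read any file.

```r
# Hypothetical helper (not part of RIPSeeker): take the format from the
# explicit argument or, failing that, from the file extension.
resolveFormat <- function(alignFilePath, format = NULL) {
  if (is.null(format)) {
    format <- tools::file_ext(alignFilePath)
  }
  format <- toupper(format)
  if (!format %in% c("BAM", "SAM", "BED")) {
    stop("Unsupported alignment format: ", format)
  }
  format
}

# Hypothetical helper: list the import steps named in the text above
# for a given file, without performing them.
importPlan <- function(alignFilePath, format = NULL, paired = FALSE) {
  reader <- if (paired) "readGAlignmentPairs" else "readGAlignments"
  switch(resolveFormat(alignFilePath, format),
    BAM = reader,
    SAM = c("convert SAM to BAM", reader),
    BED = c("import", "GAlignments")
  )
}

print(importPlan("reads.bam"))
print(importPlan("reads.sam", paired = TRUE))
print(importPlan("peaks.txt", format = "bed"))
```

Passing format explicitly covers files whose extension does not reveal the format; an unrecognized extension stops with an error instead of guessing.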
alignGal |
GAlignments object containing the processed alignments with the |
Yue Li
P. Aboyoun, H. Pages and M. Lawrence. GenomicRanges: Representation and manipulation of genomic intervals. R package version 1.8.9.
Michael Lawrence, Vince Carey and Robert Gentleman. rtracklayer: R interface to genome browsers and their annotation tracks. R package version 1.16.3.
combineAlignGals, readGAlignments, readGAlignmentPairs, import
# Retrieve system files
extdata.dir <- system.file("extdata", package="RIPSeeker")
bamFiles <- list.files(extdata.dir, ".bam$", recursive=TRUE, full.names=TRUE)
bamFiles <- grep("PRC2", bamFiles, value=TRUE)
alignGal <- getAlignGal(bamFiles[1], reverseComplement=TRUE, genomeBuild="mm9")