HISAT2 {data-navmenu="Aligning"}

hisat <- fread(paste0("grep -H '' ",
                      paste(type.list$Hisat2, collapse=" "),
                      " | perl -F':|\\(|\\)|(>?\\d\\stimes?)' -lanE '/%/ and /aligned/ ? say join qq{\t}, @F : next'"),
               header = FALSE)
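# Keep file name, read count, percentage, alignment type, and hit count.
hisat <- hisat[, c(1, 3, 5, 7, 8)]
hisat[, ':=' (V5 = as.numeric(sub("%", "", V5, fixed = TRUE)) / 100,  # "88.23%" -> 0.8823
              V7 = sub(' exactly', "", V7),
              # Sample name: file name without directories, up to the first '.' or '_'.
              # basename() also handles logs that sit at different directory depths.
              V1 = tstrsplit(basename(V1), '[._]')[[1]])]
setnames(hisat, c('Sample', 'Num', 'Percentage', 'Type', 'Times'))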
fig.height <- uniqueN(hisat[[1]]) * 0.7

Column

HISAT2 percentage plot

p <- ggplot() +
  geom_bar(data = hisat, aes(x = Type, y = Percentage, fill = Times), stat = 'identity') +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() + facet_grid(Sample ~ .) + get(paste0('scale_fill_',params$theme))()
save_plot('HISAT2.tiff', p, base_height = fig.height, base_width = 11, dpi = 300, compression = 'lzw')
save_plot('HISAT2.pdf', p, base_height = fig.height, base_width = 11, dpi = 300)
ggplotly(p) %>% layout(margin = list(b = 60))

Column

Description

This section summarizes HISAT2 mapping statistics of reads from multiple samples.

A typical HISAT2 alignment summary can be found here.

HISAT2 mapping statistics summarize how many reads aligned to the reference genome and how many did not. Mapped reads are further classified as discordantly mapped, uniquely mapped, multi-mapped, or mapped with low quality. This information is needed to evaluate library quality and the sequencing run. When an analysis involves multiple samples, this overview can help detect batch effects or outlier samples.
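
To make the classification concrete, here is a minimal base-R sketch that pulls the per-category lines out of one alignment summary. It assumes the default Bowtie2-style summary text that HISAT2 prints; the numbers in `log_lines` are made up for illustration, and the names `log_lines`, `pattern`, and `stats` are not part of LncPipeReporter, which uses the `grep`/`perl` pipeline shown in this section instead.

```r
# Illustrative paired-end summary (invented numbers, assumed default format).
log_lines <- c(
  "10000 reads; of these:",
  "  10000 (100.00%) were paired; of these:",
  "    650 (6.50%) aligned concordantly 0 times",
  "    8823 (88.23%) aligned concordantly exactly 1 time",
  "    527 (5.27%) aligned concordantly >1 times",
  "    ----",
  "    650 pairs aligned concordantly 0 times; of these:",
  "      34 (5.23%) aligned discordantly 1 time",
  "96.70% overall alignment rate"
)

# Capture: read count, percentage, alignment type, and number of hits.
# The optional " exactly" is matched but not captured, so "exactly 1 time"
# and "1 time" end up in the same 'Times' category.
pattern <- "^\\s*(\\d+) \\(([0-9.]+)%\\) (aligned.*?)(?: exactly)? (>?\\d+ times?)$"
m <- regmatches(log_lines, regexec(pattern, log_lines, perl = TRUE))

# Lines that do not match yield character(0); drop them and stack the rest.
hits <- do.call(rbind, m[lengths(m) > 0])

stats <- data.frame(
  Num        = as.integer(hits[, 2]),
  Percentage = as.numeric(hits[, 3]) / 100,  # "6.50" -> 0.065
  Type       = hits[, 4],
  Times      = hits[, 5],
  stringsAsFactors = FALSE
)
print(stats)
```

Header lines, the `----` separators, and the overall alignment rate do not match the pattern, so only the per-category rows remain. They have the same `Num`, `Percentage`, `Type`, and `Times` layout as the table in this section, minus the `Sample` column.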

Note: The aligner selected in the LncPipe parameters determines which kind of summary report is produced. LncPipeReporter can also determine the aligner automatically from the file type, so it can be run on its own with whichever aligner the user chose.

HISAT2 log table (only the first 80 rows are shown)

fwrite(hisat, 'HISAT2.csv')
DT::datatable(head(hisat, n = 80L)) %>% DT::formatRound('Percentage', digits = 2)
rm(hisat)


bioinformatist/LncPipeReporter documentation built on May 28, 2019, 7:11 p.m.