wckdouglas/correctMultiCount: Correcting multiply mapped counts

This package contains a single function, correctCounts, which assigns each multiply mapped read to the most abundant gene among those it mapped to.

library(correctMultiCount)

data(baseCount)
data(multiCount)

head(baseCount)
head(multiCount)

Note that this function requires two data frames with exactly the column names shown here.

The baseCount data frame must have 2 columns named
id
count
The multiCount data frame must have 2 columns named
fragment_id
gene_id
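
If your own tables use different column names, they can be renamed before calling the function. The following is a minimal sketch in base R; the input tables, their values, and the helper checkColumns are hypothetical and are not part of the package.

```r
# Hypothetical inputs with different column names (made-up values)
myCounts <- data.frame(gene = c('geneA', 'geneB'), reads = c(100, 10))
myHits <- data.frame(read = c('frag1', 'frag1'), gene = c('geneA', 'geneB'))

# Rename the columns to the names the function expects
myBaseCount <- setNames(myCounts, c('id', 'count'))
myMultiCount <- setNames(myHits, c('fragment_id', 'gene_id'))

# Hypothetical helper (not from the package): stop early if a column is missing
checkColumns <- function(df, required) {
  missingCols <- setdiff(required, colnames(df))
  if (length(missingCols) > 0) {
    stop('missing column(s): ', paste(missingCols, collapse = ', '))
  }
  invisible(TRUE)
}

checkColumns(myBaseCount, c('id', 'count'))
checkColumns(myMultiCount, c('fragment_id', 'gene_id'))
```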

df <- correctCounts(baseCount,multiCount)
head(df)
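
To make the rule concrete (each multiply mapped fragment goes to the most abundant gene it mapped to), here is a rough base R sketch on made-up data. It is not the package's implementation: the function name assignToMostAbundant is hypothetical, and it assumes that abundance means the value in the count column and that each assigned fragment adds one to the winning gene's count. Ties are broken arbitrarily.

```r
# Hypothetical re-implementation for illustration only; NOT the package's code.
assignToMostAbundant <- function(baseCount, multiCount) {
  # Attach the base count of each gene to every fragment-gene pair
  merged <- merge(multiCount, baseCount, by.x = 'gene_id', by.y = 'id')
  # Within each fragment, put the most abundant gene first
  merged <- merged[order(merged$fragment_id, -merged$count), ]
  # Keep only the first (most abundant) gene per fragment
  best <- merged[!duplicated(merged$fragment_id), ]
  # Count how many fragments each gene received (assumption: +1 per fragment)
  idx <- match(as.character(best$gene_id), as.character(baseCount$id))
  extra <- tabulate(idx, nbins = nrow(baseCount))
  data.frame(id = baseCount$id,
             count = baseCount$count + extra,
             stringsAsFactors = FALSE)
}

# Made-up toy data using the required column names
toyBase <- data.frame(id = c('geneA', 'geneB', 'geneC'),
                      count = c(100, 10, 5),
                      stringsAsFactors = FALSE)
toyMulti <- data.frame(fragment_id = c('frag1', 'frag1', 'frag2', 'frag2'),
                       gene_id = c('geneA', 'geneB', 'geneB', 'geneC'),
                       stringsAsFactors = FALSE)

corrected <- assignToMostAbundant(toyBase, toyMulti)
print(corrected)
```

In this toy case frag1 goes to geneA (100 versus 10) and frag2 goes to geneB (10 versus 5), so under the stated assumptions geneA and geneB each gain one count and geneC is unchanged.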

Now, let's check whether it works as expected.

library(dplyr)
compareDF <- df %>% 
  setNames(c('id','newCount')) %>%   # rename the corrected count column
  inner_join(baseCount, by = 'id') %>%   # merge the old and new count data frames
  as_tibble()

Next, find the fragments that mapped to at least two loci.

multiCount %>% 
  group_by(fragment_id) %>% 
  summarize(mapped_location_count = n()) %>% 
  filter(mapped_location_count > 1) %>%
  arrange(desc(mapped_location_count)) %>%
  as_tibble()

Let's check the third fragment from that list.

fragment <- multiCount %>%
  filter(fragment_id == 'NS500358:89:HJWK2BGXX:1:11101:22227:18777') %>%
  as_tibble()
head(fragment)

fragment_mapped_gene <- fragment$gene_id
compareDF %>%
  filter(id %in% fragment_mapped_gene)
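
One way to read such a comparison is to add the difference between the new and old counts and see which gene gained. The sketch below uses a made-up table, not the package data, and assumes that newCount is the original count plus any fragments assigned to that gene; under that assumption only the most abundant gene should show a gain.

```r
# Made-up comparison table for one fragment that mapped to two genes
compareToy <- data.frame(
  id = c('geneA', 'geneB'),
  newCount = c(101, 10),
  count = c(100, 10),
  stringsAsFactors = FALSE
)

# Difference between corrected and original counts
compareToy$gained <- compareToy$newCount - compareToy$count

# Gene with the highest original count, and genes whose count increased
topGene <- compareToy$id[which.max(compareToy$count)]
gainers <- compareToy$id[compareToy$gained > 0]
print(compareToy)
```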