This package contains only one function correctCounts as to assign multiply-mapped-reads to the most abundant gene that it mapped to.

library(correctMultiCount)

data(baseCount)
data(multiCount)

head(baseCount)
head(multiCount)

Noted that this function needed to two data frames with the exact column names as shown here.

df <- correctCounts(baseCount,multiCount)
head(df)

Now, lets check if it works as we thought.

library(dplyr)
compareDF <- df %>% 
  setNames(c('id','newCount')) %>%
  inner_join(baseCount) %>%   # merge the old and new count dataframe
  tbl_df

And find out the fragments that mapped to at least two locus

multiCount %>% 
  group_by(fragment_id) %>% 
  summarize(mapped_location_count = n()) %>% 
  filter(mapped_location_count > 1) %>%
  arrange(-mapped_location_count) %>%
  tbl_df

Lets check the third fragment

fragment = multiCount %>%
  filter(fragment_id=='NS500358:89:HJWK2BGXX:1:11101:22227:18777') %>%
  tbl_df
head(fragment)

fragment_mapped_gene <- fragment$gene_id
compareDF %>%
  filter(id %in% fragment_mapped_gene)

So here you see, the multiply-mapped gene is assign to the only gene locus that originally has count.



wckdouglas/correctMultiCount documentation built on May 4, 2019, 2:03 a.m.