Description Usage Arguments Value Author(s) Examples
View source: R/col_agrecounter_v1.R
This function allows to aggregates and summarize complex data frames with preserving names of the columns and providing the number of occurrences column "_count".
1 | col_agrecounter(inputDF, col_names, col_collapse , rows_collapse, control_col)
|
inputDF |
data frame for aggregation and counting (A&C) |
col_names |
column names which will be A&C |
col_collapse |
landmark column for columns which will A&C |
rows_collapse |
vector of characters<- content of the landmark column (col_collapse=). It can be easy generated using function wizbionet::col_to_string() |
control_col |
additional landmark column which doesn't allow to deduplicate data based on column col_collapse=. For example when we have different transcripts for a gene and we want to preserve this information |
output provides input data frame with additional columns in which rows were aggregated. For each aggregated column it provides two columns with suffixes "_coll" and "_count". Column with suffix "_coll" has aggregated strings (gene) for example: "CDH24|ONECUT2|PAX1|PTGER3". Column with suffix "_count" has numbers of items from column "_coll". Output is sorted based on the counts from the control_col= . After using this function you can try to use clusterizer_OneR() or top_percent() on col_collapse= and column with suffix _COUNT
Zofia Wicik
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | #Example
dat1<- data.frame(
mature_miRNA=c('hsa-miR-195-5p', 'hsa-miR-195-3p','hsa-miR-195-5p', 'hsa-miR-195-5p',
'hsa-miR-4753-5p', 'hsa-miR-4753-3p'),
pre_miRNA=c('hsa-miR-195', 'hsa-miR-195','hsa-miR-195', 'hsa-miR-195',
'hsa-miR-4753', 'hsa-miR-4753'),
Target=c('CDH24', 'PAX1', 'PTGER3', 'ONECUT2', 'TGFB3', 'FGFR1'))
#set parameters####
#dataframe for aggegation and counting (A&C)
inputDF<-dat1
#selected column names which will be A&C
col_names <- c( "Target")
#landmark column for A&C columns,other columns will be aggregated based on this
col_collapse <- "pre_miRNA"
#vector of content of the landmark column "col_collapse" as vector.
#You can use internal col_to_string() function
rows_collapse <-col_to_string(inputDF$pre_miRNA)
#additional landmark column which will not allow to deduplicate.
#Useful when you want to analyze pre-miRNAs instead of mature mirNAs
control_col<- "mature_miRNA"
#run function
output<- col_agrecounter(inputDF, col_names, col_collapse , rows_collapse, control_col)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.