Description Usage Arguments Value Author(s) Examples
This function enables comparison of data sets of different lengths. It is intended for gene lists that have associated numeric values. It is an alternative to clusterizer_oneR, which does not deal well with continuous values such as p-values or fold changes. The analyzed gene lists can be prioritized based on the scores assigned after data aggregation and counting. The function helps to avoid arbitrary selection of top candidates by subsetting the top percent of genes for a given cutoff. All genes at the cutoff that share the same value are included. The function generates a new column with a TRUE or FALSE value indicating whether each gene was present in the top percent.
top_percent(inputDF,
            landmark_col, cols_to_cluster, cutoff)
inputDF | input data frame; it must contain at least two columns, those named in landmark_col and cols_to_cluster
landmark_col | column of the input data frame to analyze, for example a column with gene symbols (character)
cols_to_cluster | one or more columns of the input data frame with numeric scores (counts), for example the number of regulatory miRNAs for each gene or the number of data sets in which the gene was present
cutoff | percentage of top hits to select; the default is 25 percent
Output column clus_...: logical value indicating whether the gene was present in the top percent for the given cutoff. The column name includes the cutoff value.
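The tie-inclusive selection described above can be illustrated with a minimal sketch. This is not the package's implementation: `flag_top_percent` is a hypothetical helper, and the sketch assumes that higher scores rank higher and that the threshold is the value at the last position within the top `cutoff` percent.

```r
# Hypothetical helper, not part of the package: flag values in the top
# `cutoff` percent, keeping every value tied with the threshold.
flag_top_percent <- function(x, cutoff = 25) {
  # Number of entries that make up the top `cutoff` percent (at least one).
  n_top <- max(1, ceiling(length(x) * cutoff / 100))
  # Smallest value that still falls inside the top `n_top` entries.
  threshold <- sort(x, decreasing = TRUE)[n_top]
  # Ties with the threshold are included, so more than n_top may be TRUE.
  x >= threshold
}

# Ten scores; the top 60 percent is six entries, but the sixth and seventh
# values are tied, so both are flagged.
scores <- c(66, 62, 54, 40, 34, 32, 32, 16, 15, 15)
flagged <- flag_top_percent(scores, cutoff = 60)
print(data.frame(scores, flagged))
```

Because ties are kept, the number of flagged rows can exceed the nominal percentage, which avoids an arbitrary split between candidates with identical scores.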
Zofia Wicik
#example####
#create input DF called DE_miRNA
miR<-c('hsa-miR-497-5p', 'hsa-miR-106a-5p', 'hsa-miR-195-5p', 'hsa-miR-4753-3p',
'hsa-miR-493-5p', 'hsa-miR-450b-5p', 'hsa-miR-448', 'hsa-miR-1264', 'hsa-miR-541-5p',
'hsa-miR-449b-5p', 'hsa-miR-493-3p', 'hsa-miR-4731-3p', 'hsa-miR-106a-3p', 'hsa-miR-345-5p',
'hsa-miR-3612', 'hsa-miR-1343', 'hsa-miR-1197', 'hsa-miR-1229-3p', 'hsa-miR-4766-3p',
'hsa-miR-580-3p', 'hsa-miR-345-3p', 'hsa-miR-4714-5p')
values_A<- c(66, 62, 54, 40, 34, 32, 32, 16, 15, 15, 15, 14, 14, 9,
9, 9, 9, 8, 5, 5, 4, 1)
values_B<- c(3, 5, 12, 14, 7, 7, 7, 1, 1, 13, 20, 12, 15,
6, 2, 2, 1, 12, 21, 10, 13, 3)
DE_miRNA <- data.frame(miR, values_A, values_B)
#set parameters
inputDF <- DE_miRNA
name_input_df <- "DE_miRNA"
landmark_col <- "miR"
cols_to_cluster <- c('values_A', 'values_B')
cutoff <- 20
#run function
temp <- top_percent(inputDF, landmark_col, cols_to_cluster, cutoff)