View source: R/analysis-functions.R
gene_frequency_fisher | R Documentation |
Given 2 data frames with CIS calculations, obtained via CIS_grubbs(), computes Fisher's exact test.
Results can be plotted via fisher_scatterplot().
gene_frequency_fisher(
  cis_x,
  cis_y,
  min_is_per_gene = 3,
  gene_set_method = c("intersection", "union"),
  onco_db_file = "proto_oncogenes",
  tumor_suppressors_db_file = "tumor_suppressors",
  species = "human",
  known_onco = known_clinical_oncogenes(),
  suspicious_genes = clinical_relevant_suspicious_genes(),
  significance_threshold = 0.05,
  remove_unbalanced_0 = TRUE
)
cis_x |
A data frame obtained via |
cis_y |
A data frame obtained via |
min_is_per_gene |
Used for pre-filtering. Genes with fewer distinct integrations than this number are filtered out prior to the calculations. Single numeric or integer. |
gene_set_method |
One of "intersection" or "union". When merging
the 2 data frames, |
onco_db_file |
UniProt file for proto-oncogenes (see details). If different from the default, it should be supplied as a path to a file. |
tumor_suppressors_db_file |
UniProt file for tumor-suppressor genes. If different from the default, it should be supplied as a path to a file. |
species |
One between |
known_onco |
Data frame with known oncogenes. See details. |
suspicious_genes |
Data frame with clinically relevant suspicious genes. See details. |
significance_threshold |
Significance threshold for the Fisher's test p-value |
remove_unbalanced_0 |
If TRUE, removes from the final output those pairs in which one of the two groups has no IS and the number of IS in the other (non-missing) group is less than the mean number of IS for that group |
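To make the roles of gene_set_method and significance_threshold more concrete, here is a minimal base R sketch on toy data. It is an illustration only: the data frames, the column names gene and n_is, and the helpers merge_genes() and fisher_p() are hypothetical and not part of ISAnalytics, and the 2x2 table shown (IS in the gene vs. all other IS, per group) is an assumption about a reasonable layout, not necessarily the table the package builds internally.

```r
# Toy per-gene integration-site (IS) counts for two groups (hypothetical data).
cis_x_toy <- data.frame(gene = c("A", "B", "C"), n_is = c(12, 3, 5))
cis_y_toy <- data.frame(gene = c("B", "C", "D"), n_is = c(4, 20, 6))

# "intersection" keeps genes found in both groups, "union" keeps every gene.
merge_genes <- function(x, y, method = c("intersection", "union")) {
  method <- match.arg(method)
  merged <- merge(x, y,
    by = "gene", all = (method == "union"),
    suffixes = c("_x", "_y")
  )
  # A gene missing from one group has zero IS in that group.
  merged[is.na(merged)] <- 0
  merged
}

# Fisher's exact test on a 2x2 table:
# rows = (IS in this gene, all other IS), columns = (group x, group y).
fisher_p <- function(n_x, n_y, total_x, total_y) {
  tab <- matrix(c(n_x, total_x - n_x, n_y, total_y - n_y), nrow = 2)
  fisher.test(tab)$p.value
}

total_x <- sum(cis_x_toy$n_is)
total_y <- sum(cis_y_toy$n_is)

both <- merge_genes(cis_x_toy, cis_y_toy, "intersection")
both$p_value <- mapply(fisher_p, both$n_is_x, both$n_is_y,
  MoreArgs = list(total_x = total_x, total_y = total_y)
)
# Flag genes whose p-value falls below the chosen threshold.
both$significant <- both$p_value < 0.05
both
```

With "union", genes A and D would also be kept, each with a zero count in one group; those are the kind of pairs that remove_unbalanced_0 is meant to screen.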
These files are included in the package for user convenience and are
simply UniProt files with gene annotations for human and mouse.
For more details on how these files were generated, see the help pages
?tumor_suppressors and ?proto_oncogenes.
The default values are included in this package and can be accessed with:
known_clinical_oncogenes()
If the user wants to change this parameter the input data frame must
preserve the column structure. The same goes for the suspicious_genes
parameter (DOIReference column is optional):
clinical_relevant_suspicious_genes()
A data frame
The function will explicitly check for the presence of these tags:
gene_symbol
Other Analysis functions:
CIS_grubbs(), HSC_population_size_estimate(), compute_abundance(), cumulative_is(), is_sharing(), iss_source(), sample_statistics(), top_integrations(), top_targeted_genes()
data("integration_matrices", package = "ISAnalytics")
data("association_file", package = "ISAnalytics")
aggreg <- aggregate_values_by_key(
    x = integration_matrices,
    association_file = association_file,
    value_cols = c("seqCount", "fragmentEstimate")
)
cis <- CIS_grubbs(aggreg, by = "SubjectID")
fisher <- gene_frequency_fisher(cis$cis$PT001, cis$cis$PT002,
    min_is_per_gene = 2
)
fisher