importIPAenrichment: Import Ingenuity Pathway Analysis 'IPA' results
In jmw86069/jamenrich: Analysis and Visualization of Multiple Gene Set Enrichments

importIPAenrichment

R Documentation

Import Ingenuity Pathway Analysis 'IPA' results

Description

Import Ingenuity Pathway Analysis 'IPA' results

Usage

importIPAenrichment(
  ipaFile,
  headerGrep =
    "(^|\t)((expr.|-log.|)p-value|Pvalue|Score($|\t)|Symbol($|\t)|Ratio($|\t)|Consistency.Score|Master.Regulator($|\t))",
  ipaNameGrep = c("Pathway", "Regulator$", "Regulators", "Regulator", "Disease",
    "Toxicity", "Category", "Categories", "Function", "Symbol$", "^ID$",
    "My.(Lists|Pathways)"),
  geneGrep = c("Molecules in Network", "Target molecules", "Molecules", "Symbol"),
  geneCurateFrom = c("[ ]*[(](complex|includes others)[)][ ]*", "^[, ]+|[, ]+$"),
  geneCurateTo = c("", ""),
  method = 1,
  sheet = 1,
  sep = "\t",
  xlsxMultiSheet = TRUE,
  useXlsxSheetNames = FALSE,
  remove_blank_colnames = TRUE,
  convert_ipa_slash = TRUE,
  ipa_slash_sep = ":",
  revert_ipa_xref = TRUE,
  verbose = FALSE,
  ...
)

Arguments

`ipaFile`	one of the four input types described under Input format: a character vector of text file names; a character vector of Excel `.xlsx` file names, with either one tall worksheet or one worksheet per enrichment type; or a list of `data.frame` objects.
`headerGrep`	regular expression pattern used to recognize header columns found in Ingenuity IPA enrichment data.
`ipaNameGrep`	vector of regular expression patterns used to recognize the name of the enriched entity, for example the biological pathway, or network, or disease category, etc.
`geneGrep`	vector of regular expression patterns used to recognize the column containing genes, that is, the molecules tested for enrichment which were found in the enriched entity.
`geneCurateFrom`, `geneCurateTo`	vector of patterns and replacements, respectively, used to curate values in the gene column. These replacement rules are used to ensure that genes are delimited consistently, with no leading or trailing delimiters.
`method`	integer value indicating the method used to import data from a text file, where: `method=1` uses `utils::read.table()` reading from a `textConnection()`; `method=2` uses `readr::read_tsv()`. The motivation to use `read.table()` is that it performed better in the presence of UTF-8 characters such as the alpha symbol.
`sheet`	integer value used only when `ipaFile` is a vector of Excel `.xlsx` files, and when the Excel format includes multiple worksheets. This value will extract enrichment data only from one worksheet from each Excel file.
`sep`	character string used when `ipaFile` is a vector of text files, to split fields into columns. The default will split fields by the tab character.
`xlsxMultiSheet`	logical indicating whether input Excel `.xlsx` files contain multiple worksheets.
`useXlsxSheetNames`	logical indicating whether to use the Excel worksheet name for each imported enrichment table, when importing from `.xlsx` files, and when `xlsxMultiSheet=FALSE`. When `xlsxMultiSheet=TRUE` the name is derived from the value matched using `ipaNameGrep`, because in this case, there are expected to be multiple enrichment tables in one worksheet.
`remove_blank_colnames`	`logical` indicating whether to drop columns where all values are contained in `c(NA, "")`. Setting `remove_blank_colnames=FALSE` may be preferable when all values in some column like `zScore` are `NA`, but you would still like to retain the column for consistency with other data. We found that IPA does not report `zScore` values when there are only 4 or fewer genes involved in each enrichment result.
`convert_ipa_slash`	`logical` indicating whether to convert the forward slash `"/"` in IPA gene names. IPA treats some genes as one entity, for example `"HSPA1A/HSPA1B"` is considered one gene, even though it covers two Entrez gene entries, `"HSPA1A"` and `"HSPA1B"`. Regardless of whether one or both genes are provided to IPA, it counts them as one entity for the purpose of pathway enrichment hypergeometric testing. Unfortunately, the forward slash `"/"` is also used by the `clusterProfiler` object `enrichResult` as its gene delimiter, which is hard-coded and cannot be changed, so it would automatically treat `"HSPA1A/HSPA1B"` as two genes, causing a mismatch with the IPA results. When `convert_ipa_slash=TRUE` (the default), the forward slash `"/"` is replaced with the value of argument `ipa_slash_sep`.
`ipa_slash_sep`	`character` string used as a delimiter when `convert_ipa_slash=TRUE`; it replaces the forward slash `"/"` in gene names that contain one.
`revert_ipa_xref`	`logical` indicating whether to revert the gene symbols reported by IPA to the original user-supplied identifiers, which requires that the IPA data contains a section `"Analysis Ready Molecules"`.
`verbose`	logical indicating whether to print verbose output.
`...`	additional arguments are ignored.
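
The gene curation and slash conversion arguments can be illustrated with a minimal base R sketch. It is not the package's internal code: the helper `curate_genes()` and the sample gene strings are hypothetical, and it only assumes that each `geneCurateFrom` pattern is replaced in order by the matching `geneCurateTo` value.

```r
# Default patterns copied from the Usage section.
geneCurateFrom <- c("[ ]*[(](complex|includes others)[)][ ]*", "^[, ]+|[, ]+$")
geneCurateTo <- c("", "")

# Hypothetical helper (assumption): apply each pattern/replacement pair in order.
curate_genes <- function(x, from = geneCurateFrom, to = geneCurateTo) {
  for (i in seq_along(from)) {
    x <- gsub(from[i], to[i], x)
  }
  x
}

# Made-up values in the style of an IPA "Molecules" column.
genes <- c(
  "ACTB,HSPA1A/HSPA1B,Hsp90 (complex), ",
  ", TP53 (includes others),MYC"
)
curated <- curate_genes(genes)

# Replace the IPA slash with ipa_slash_sep, so that "/" remains available
# as the gene delimiter expected by enrichResult.
ipa_slash_sep <- ":"
converted <- gsub("/", ipa_slash_sep, curated, fixed = TRUE)
print(converted)
```

After curation the annotations and stray delimiters are gone, and `"HSPA1A/HSPA1B"` is carried as the single entity `"HSPA1A:HSPA1B"`.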

Details

This function parses Ingenuity Pathway Analysis ('IPA') enrichment data into a list of enrichment data.frame objects for downstream analysis. Each data.frame represents the results of one Ingenuity IPA section. Note that not all sections represent statistical results.

Motivation

Separate multiple IPA enrichment tables.
Rename colnames to be consistent.
Revert IPA gene aliases to original user input (optional).

Input format

ipaFile can be a text .txt file, where the text file contains all IPA enrichment data in tall format. This format is most common.
ipaFile can be an Excel .xlsx file, which contains all IPA enrichment data in one tall worksheet tab.
ipaFile can be an Excel .xlsx file, where each type of IPA enrichment appears on a separate Excel worksheet tab.
ipaFile can be a list of data.frame objects. This option is intended when the IPA data has already been imported into R as separate data.frame objects.

Notes

When using "Export All" from 'IPA', the default text format includes multiple enrichment tables concatenated together in one file. Each enrichment table contains its own unique column headers, with descriptive text in the line preceding the column headers. This function is intended to separate the enrichment tables into a list of data.frame objects, and retain the descriptive text as names of the list.
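
The separation step described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the function's actual implementation: the sample lines are made up, tables are assumed to be separated by a blank line, and `perl = TRUE` is chosen here only so the empty alternative in the default `headerGrep` is handled unambiguously.

```r
# Default header pattern copied from the Usage section.
headerGrep <- "(^|\t)((expr.|-log.|)p-value|Pvalue|Score($|\t)|Symbol($|\t)|Ratio($|\t)|Consistency.Score|Master.Regulator($|\t))"

# Made-up miniature of an "Export All" text file: two tables, each preceded
# by one line of descriptive text.
ipa_lines <- c(
  "Canonical Pathways for My Analysis",
  "Ingenuity Canonical Pathways\t-log(p-value)\tRatio\tMolecules",
  "Pathway A\t3.2\t0.10\tACTB,MYC",
  "Pathway B\t2.1\t0.05\tTP53",
  "",
  "Upstream Regulators for My Analysis",
  "Upstream Regulator\tp-value of overlap\tTarget molecules in dataset",
  "TGFB1\t0.001\tACTB,TP53"
)

# Lines that look like column headers.
header_idx <- which(grepl(headerGrep, ipa_lines, perl = TRUE))

tables <- lapply(header_idx, function(i) {
  # Assumption: each table ends just before the next blank line, or at the
  # end of the file.
  blank_after <- which(ipa_lines == "" & seq_along(ipa_lines) > i)
  end <- if (length(blank_after) > 0) min(blank_after) - 1 else length(ipa_lines)
  read.table(text = ipa_lines[i:end], sep = "\t", header = TRUE,
    quote = "", comment.char = "", check.names = FALSE,
    stringsAsFactors = FALSE)
})

# The descriptive line preceding each header becomes the list name.
names(tables) <- ipa_lines[header_idx - 1]
print(names(tables))
```

Each element of `tables` is a data.frame with its own column headers, which mirrors the list structure the function returns.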

Troubleshooting

A common error occurs when reverting IPA gene symbols to the original user-supplied identifier, which is done by default (revert_ipa_xref=TRUE). For errors during this step, consider revert_ipa_xref=FALSE, which retains the gene symbol as recognized by IPA. The downside of this approach is that it may be more difficult to relate each symbol to the input identifier. In that case look at the "Analysis Ready Molecules" data.frame, which should contain the user-provided values as "ID", the IPA recognized symbol as "Name", and optionally a column "Symbol" which is edited by multienrichjam.
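
When importing with revert_ipa_xref=FALSE, the mapping back to input identifiers could be done by hand along these lines. The sketch assumes only the "ID" and "Name" columns described above; the table contents and gene string are made up, and this is not the package's own reversion logic.

```r
# Made-up stand-in for the "Analysis Ready Molecules" data.frame.
arm <- data.frame(
  ID = c("HSPA1A", "Trp53", "MYC"),
  Name = c("HSPA1A/HSPA1B", "TP53", "MYC"),
  stringsAsFactors = FALSE
)

# Gene symbols as IPA reports them in one enrichment row, comma-delimited.
ipa_genes <- "HSPA1A/HSPA1B,TP53,MYC,ACTB"

symbols <- strsplit(ipa_genes, ",", fixed = TRUE)[[1]]
idx <- match(symbols, arm$Name)

# Keep the IPA symbol when no user-supplied ID is available.
user_ids <- ifelse(is.na(idx), symbols, arm$ID[idx])
reverted <- paste(user_ids, collapse = ",")
print(reverted)
```

Symbols without an entry in the table, such as "ACTB" here, pass through unchanged, which makes unmatched genes easy to spot.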

Value

list of data.frame objects, where each data.frame contains enrichment data for one of the Ingenuity IPA enrichment tests.
