geva.input.correct: GEVA Input Post-processing


View source: R/input.R

Description

Helper functions used to edit the contents of a GEVAInput object.

Usage

geva.input.correct(ginput, na.rm = TRUE, inf.rm = TRUE, invalid.col.rm = TRUE)

geva.input.filter(
  ginput,
  p.value.cutoff = 0.05,
  by.any = FALSE,
  na.val = 0,
  ...
)

geva.input.rename.rows(
  ginput,
  attr.column,
  dupl.rm.method = c("least.p.vals", "order")
)

Arguments

ginput

A GEVAInput object

na.rm

logical; if TRUE, removes all rows containing NA

inf.rm

logical; if TRUE, removes all rows containing infinite values (Inf or -Inf)

invalid.col.rm

logical; if TRUE, searches for any column composed entirely of invalid values (according to the other arguments) and removes it before checking the rows

p.value.cutoff

numeric (0 to 1), the p-value cutoff. Rows containing values above this threshold are removed

by.any

logical, set to TRUE to delete the rows with at least one occurrence above the cutoff; or FALSE to delete only those rows in which all values are above the specified threshold

na.val

numeric, the replacement for NA values

...

optional arguments. Accepts verbose (logical, default is TRUE) to enable or disable printing the progress

attr.column

character, target column with the values that will replace the current row names

dupl.rm.method

character, method to remove duplicate names. The possible options are:

  • "least.p.vals" : Keeps the duplicate with the smallest sum of p-values

  • "order" : Keeps the first occurrence of the duplicate in the current row order

Details

geva.input.correct corrects the numeric input data (values and weights), removing rows that include invalid values such as NA or infinite.
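The row and column clean-up described above can be illustrated in base R on a plain numeric matrix. This is only a sketch of the idea, not the package's implementation: the matrix vals is a hypothetical stand-in for the numeric tables inside a GEVAInput, and NA and infinite values are treated together here, whereas the real function controls them separately through na.rm and inf.rm.

```r
# Hypothetical stand-in for a numeric table stored in a GEVAInput
vals <- cbind(a = c(1, NA, 3, Inf),
              b = c(NA, NA, NA, NA),    # column made up only of invalid values
              c = c(0.5, 0.2, -1, 2))

# Step 1 (the idea behind invalid.col.rm): drop columns that are invalid everywhere
invalid <- !is.finite(vals)             # TRUE for NA, NaN, Inf and -Inf
vals <- vals[, !apply(invalid, 2, all), drop = FALSE]

# Step 2 (the idea behind na.rm and inf.rm): drop rows that still hold an invalid value
keep <- rowSums(!is.finite(vals)) == 0
vals <- vals[keep, , drop = FALSE]

print(dim(vals))
```

Removing the all-invalid column first matters: if column b were kept, every row would contain an NA and the row check would discard the whole table.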

geva.input.filter attempts to select the most relevant part of the input data, removing rows containing p-values (1 - weights) above a specific threshold.
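The difference between the two by.any settings can be shown with a small base R sketch. The matrix pvals and its row names are made up for illustration, and the code only mirrors the behaviour described in the Arguments section. It does not call the package or handle NA replacement.

```r
# Hypothetical p-value matrix: rows are features, columns are comparisons
pvals <- rbind(g1 = c(0.01, 0.20),
               g2 = c(0.30, 0.90),
               g3 = c(0.02, 0.04))
cutoff <- 0.05

above <- pvals > cutoff                 # TRUE where a p-value exceeds the cutoff

# by.any = FALSE: a row is removed only if all of its p-values are above the cutoff
keep_all <- !apply(above, 1, all)

# by.any = TRUE: a row is removed if at least one p-value is above the cutoff
keep_any <- !apply(above, 1, any)

kept_default <- rownames(pvals)[keep_all]
kept_strict  <- rownames(pvals)[keep_any]
print(kept_default)
print(kept_strict)
```

In this sketch the default setting keeps g1 and g3, while the stricter by.any = TRUE keeps only g3, because g1 has one p-value above the cutoff.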

geva.input.rename.rows replaces the row names with a column from the feature table (see GEVAInput). The column name specified for attr.column must be one of names(featureTable(ginput)). Any duplicates are removed according to dupl.rm.method, and the selected duplicates are stored as a new column named "renamed_id" inside the feature table of the returned object.
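The two duplicate-removal strategies can be sketched in base R as follows. The vectors symbols and pvals are hypothetical examples, and the code is only one way to express "first occurrence" and "smallest sum of p-values". How the package implements these internally is not shown in this page.

```r
# Hypothetical data: four rows, two of which map to the same new name
symbols <- c("TP53", "BRCA1", "TP53", "MYC")
pvals <- rbind(c(0.04, 0.10),
               c(0.01, 0.02),
               c(0.01, 0.03),
               c(0.20, 0.50))
rownames(pvals) <- paste0("probe", 1:4)

# "order": keep the first occurrence of each name in the current row order
keep_order <- !duplicated(symbols)

# "least.p.vals": keep, for each name, the row with the smallest p-value sum
psum <- rowSums(pvals)
ord <- order(symbols, psum)             # group by name, lowest sum first
keep_least <- rep(FALSE, length(symbols))
keep_least[ord[!duplicated(symbols[ord])]] <- TRUE

print(rownames(pvals)[keep_order])
print(rownames(pvals)[keep_least])
```

Here "order" keeps probe1 for TP53, while "least.p.vals" keeps probe3 because its p-values sum to 0.04 rather than 0.14.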

Value

A modified GEVAInput object

Examples

## geva.input.correct example
colexample1 <- runif(1000, -1, 1)        # Random column 1
colexample2 <- runif(1000, -1, 1)        # Random column 2
colexample3 <- runif(1000, -1, 1)        # Random column 3
colexample3[runif(1000, -1, 1) < 0] <- NA # Random NA's
ginput <- geva.merge.input(col1=colexample1,
                           col2=colexample2,
                           col3=colexample3)
# Before the correction:
print(nrow(ginput))    # Returns 1000
# Applies the correction (removes rows with NA's)
ginput <- geva.input.correct(ginput)
# After the correction:
print(nrow(ginput))    # Returns fewer than 1000

## ---
## geva.input.filter example
ginput <- geva.ideal.example(1000)  # Generates a random input
# Before the filter:
print(nrow(ginput))    # Returns 1000
# Applies the filter
ginput <- geva.input.filter(ginput)
# After the filter:
print(nrow(ginput))    # Returns fewer than 1000

## ---
## geva.input.rename.rows example
ginput <- geva.ideal.example()  # Generates a random input
# Renames to 'Symbol'
ginput <- geva.input.rename.rows(ginput,
                                 attr.column = "Symbol")
print(head(ginput))             # The row names are now set to the gene symbols

nunesijg/geva documentation built on March 12, 2021, 3:58 p.m.